Please Write Detailed REVIEW On Following Topic. Topic Name: “Congestion Avoidance and Control”
Need 1 page of detailed review on above mentioned topic. It should have
1) Introduction
2) Theoretical and Empirical Details
3) Merits
4) De merits
5) Assumptions
6) Conclusion etc…
NOTE: All Headings font size should be 10( times new roman)
Review Guidelines:
1. One page, single space, margin 1 inch (top, bottom, left, right), font (times new roman, size 9) Congestion Avoidance and Control∗
Van Jacobson†
Lawrence Berkeley Laboratory
Michael J. Karels‡
University of California at Berkeley
November, 1988
Introduction
Computer networks have experienced an explosive growth over the past few years and with
that growth have come severe congestion problems. For example, it is now common to see
internet gateways drop 10% of the incoming packets because of local buffer overflows.
Our investigation of some of these problems has shown that much of the cause lies in
transport protocol implementations (not in the protocols themselves): The ‘obvious’ ways
to implement a window-based transport protocol can result in exactly the wrong behavior
in response to network congestion. We give examples of ‘wrong’ behavior and describe
some simple algorithms that can be used to make right things happen. The algorithms are
rooted in the idea of achieving network stability by forcing the transport connection to obey
a ‘packet conservation’ principle. We show how the algorithms derive from this principle
and what effect they have on traffic over congested networks.
In October of ’86, the Internet had the first of what became a series of ‘congestion col-
lapses’. During this period, the data throughput from LBL to UC Berkeley (sites separated
by 400 yards and two IMP hops) dropped from 32 Kbps to 40 bps. We were fascinated by
this sudden factor-of-thousand drop in bandwidth and embarked on an investigation of why
things had gotten so bad. In particular, we wondered if the 4.3BSD (Berkeley UNIX ) TCP
was mis-behaving or if it could be tuned to work better under abysmal network conditions.
The answer to both of these questions was “yes”.
∗Note: This is a very slightly revised version of a paper originally presented at SIGCOMM ’88 [12]. If you
wish to reference this work, please cite [12].
†This work was supported in part by the U.S. Department of Energy under Contract Number DE-AC03-
76SF00098.
‡This work was supported by the U.S. Department of Commerce, National Bureau of Standards, under
Grant Number 60NANB8D0830.
1
1 GETTING TO EQUILIBRIUM: SLOW-START 2
Since that time, we have put seven new algorithms into the 4BSD TCP:
(i) round-trip-time variance estimation
(ii) exponential retransmit timer backoff
(iii) slow-start
(iv) more aggressive receiver ack policy
(v) dynamic window sizing on congestion
(vi) Karn’s clamped retransmit backoff
(vii) fast retransmit
Our measurements and the reports of beta testers suggest that the final product is fairly good
at dealing with congested conditions on the Internet.
This paper is a brief description of (i) – (v) and the rationale behind them. (vi) is an algo-
rithm recently developed by Phil Karn of Bell Communications Research, described in [16].
(vii) is described in a soon-to-be-published RFC (ARPANET “Request for Comments”).
Algorithms (i) – (v) spring from one observation: The flow on aTCP connection (or
ISO TP-4 or Xerox NS SPPconnection) should obey a ‘conservation of packets’ principle.
And, if this principle were obeyed, congestion collapse would become the exception rather
than the rule. Thus congestion control involves finding places that violate conservation and
fixing them.
By ‘conservation of packets’ we mean that for a connection ‘in equilibrium’, i.e., run-
ning stably with a full window of data in transit, the packet flow is what a physicist would
call ‘conservative’: A new packet isn’t put into the network until an old packet leaves. The
physics of flow predicts that systems with this property should be robust in the face of
congestion.1 Observation of the Internet suggests that it was not particularly robust. Why
the discrepancy?
There are only three ways for packet conservation to fail:
1. The connection doesn’t get to equilibrium, or
2. A sender injects a new packet before an old packet has exited, or
3. The equilibrium can’t be reached because of resource limits along the path.
In the following sections, we treat each of these in turn.
1 Getting to Equilibrium: Slow-start
Failure (1) has to be from a connection that is either starting or restarting after a packet
loss. Another way to look at the conservation property is to say that the sender uses acks as
a ‘clock’ to strobe new packets into the network. Since the receiver can generate acks no
1A conservative flow means that for any given time, the integral of the packet density around the sender–
receiver–sender loop is a constant. Since packets have to ‘diffuse’ around this loop, the integral is sufficiently
continuous to be a Lyapunov function for the system. A constant function trivially meets the conditions for
Lyapunov stability so the system is stable and any superposition of such systems is stable. (See [3], chap. 11–
12 or [21], chap. 9 for excellent introductions to system stability theory.)
1 GETTING TO EQUILIBRIUM: SLOW-START 3
Figure 1:Window Flow Control ‘Self-clocking’
Pr
ArAs
Pb
ReceiverSender
Ab
This is a schematic representation of a sender and receiver on high bandwidth networks
connected by a slower, long-haul net. The sender is just starting and has shipped a win-
dow’s worth of packets, back-to-back. The ack for the first of those packets is about to
arrive back at the sender (the vertical line at the mouth of the lower left funnel).
The vertical dimension is bandwidth, the horizontal dimension is time. Each of the shaded
boxes is a packet. Bandwidth× Time= Bits so the area of each box is the packet size. The
number of bits doesn’t change as a packet goes through the network so a packet squeezed
into the smaller long-haul bandwidth must spread out in time. The timePb represents the
minimum packet spacing on the slowest link in the path (thebottleneck). As the packets
leave the bottleneck for the destination net, nothing changes the inter-packet interval so on
the receiver’s net packet spacingPr = Pb. If the receiver processing time is the same for
all packets, the spacing between acks on the receiver’s netAr = Pr = Pb. If the time slot
Pb was big enough for a packet, it’s big enough for an ack so the ack spacing is preserved
along the return path. Thus the ack spacing on the sender’s netAs = Pb.
So, if packets after the first burst are sent only in response to an ack, the sender’s packet
spacing will exactly match the packet time on the slowest link in the path.
faster than data packets can get through the network, the protocol is ‘self clocking’ (fig. 1).
Self clocking systems automatically adjust to bandwidth and delay variations and have a
wide dynamic range (important considering thatTCP spans a range from 800 Mbps Cray
channels to 1200 bps packet radio links). But the same thing that makes a self-clocked
system stable when it’s running makes it hard to start — to get data flowing there must be
acks to clock out packets but to get acks there must be data flowing.
To start the ‘clock’, we developed aslow-start algorithm to gradually increase the
amount of data in-transit.2 Although we flatter ourselves that the design of this algorithm is
rather subtle, the implementation is trivial — one new state variable and three lines of code
in the sender:
2Slow-start is quite similar to theCUTE algorithm described in [14]. We didn’t know aboutCUTE at the time
we were developing slow-start but we should have—CUTE preceded our work by several months.
When describing our algorithm at the Feb., 1987, Internet Engineering Task Force (IETF) meeting, we called
it soft-start, a reference to an electronics engineer’s technique to limit in-rush current. The nameslow-startwas
coined by John Nagle in a message to the IETF mailing list in March, ’87. This name was clearly superior to
ours and we promptly adopted it.
1 GETTING TO EQUILIBRIUM: SLOW-START 4
Figure 2:The Chronology of a Slow-start
1
2
3
1
One Round Trip Time
0R
1R
2
4
5
3
6
7
2R
4
8
9
5
10
11
6
12
13
7
14
15
3R
One Packet Time
The horizontal direction is time. The continuous time line has been chopped into one-
round-trip-time pieces stacked vertically with increasing time going down the page. The
grey, numbered boxes are packets. The white numbered boxes are the corresponding acks.
As each ack arrives, two packets are generated: one for the ack (the ack says a packet has
left the system so a new packet is added to take its place) and one because an ack opens
the congestion window by one packet. It may be clear from the figure why an add-one-
packet-to-window policy opens the window exponentially in time.
If the local net is much faster than the long haul net, the ack’s two packets arrive at the
bottleneck at essentially the same time. These two packets are shown stacked on top of one
another (indicating that one of them would have to occupy space in the gateway’s outbound
queue). Thus the short-term queue demand on the gateway is increasing exponentially and
opening a window of sizeW packets will requireW/2 packets of buffer capacity at the
bottleneck.
• Add acongestion window, cwnd, to the per-connection state.
• When starting or restarting after a loss, set cwnd to one packet.
• On each ack for new data, increase cwnd by one packet.
• When sending, send the minimum of the receiver’s advertised window and cwnd.
Actually, the slow-start window increase isn’t that slow: it takes timeRlog2W whereR
is the round-trip-time andW is the window size in packets (fig. 2). This means the window
opens quickly enough to have a negligible effect on performance, even on links with a
large bandwidth–delay product. And the algorithm guarantees that a connection will source
data at a rate at most twice the maximum possible on the path. Without slow-start, by
contrast, when 10 Mbps Ethernet hosts talk over the 56 Kbps Arpanet via IP gateways, the
2 CONSERVATION AT EQUILIBRIUM: ROUND-TRIP TIMING 5
Figure 3:Startup behavior of TCP without Slow-start
••
••
••
••
••
••
••
••
••
••
••
••
••
••
••
••
••
••
••
• •
•••
••
••
••
••
•
••
••
••
••
••
•••
••
••
••
••
••
••
••
••
••
• •
•••
••
••
• •
••
••
••
••
••
•
• •
••
••
••
••
••
•••
••
••
••
••
••
••
••
••
••
•
••
••
••
••
••
••
••
••
••
••
••
••
••
•••
••
••
••
••
••
••
••
••
••
••
•••
•
•••
••
••
••
••
••
••
••
••
••
••
•••
••
••
• •
••
••
••
••
••
••
••
••
••
••
•••
••
•••
••
•
•
••
••
••
••
••
••
••
••
••
••
••
•
•••
••
••
••
••
••
••
••
••
••
•••
••
••
Send Time (sec)
P
a
ck
e
t
S
e
q
u
e
n
ce
N
u
m
b
e
r
(K
B
)
0 2 4 6 8 10
0
1
0
2
0
3
0
4
0
5
0
6
0
7
0
Trace data of the start of aTCP conversation between two Sun 3/50s running SunOS 3.5
(the 4.3BSD TCP). The two Suns were on different Ethernets connected by IP gateways
driving a 230.4 Kbps point-to-point link (essentially the setup shown in fig. 7). The win-
dow size for the connection was 16KB (32 512-byte packets) and there were 30 packets of
buffer available at the bottleneck gateway. The actual path contains six store-and-forward
hops so the pipe plus gateway queue has enough capacity for a full window but the gateway
queue alone does not.
Each dot is a 512 data-byte packet. The x-axis is the time the packet was sent. The y-
axis is the sequence number in the packet header. Thus a vertical array of dots indicate
back-to-back packets and two dots with the same y but different x indicate a retransmit.
‘Desirable’ behavior on this graph would be a relatively smooth line of dots extending
diagonally from the lower left to the upper right. The slope of this line would equal the
available bandwidth. Nothing in this trace resembles desirable behavior.
The dashed line shows the 20 KBps bandwidth available for this connection. Only 35%
of this bandwidth was used; the rest was wasted on retransmits. Almost everything is
retransmitted at least once and data from 54 to 58 KB is sent five times.
first-hop gateway sees a burst of eight packets delivered at 200 times the path bandwidth.
This burst of packets often puts the connection into a persistent failure mode of continuous
retransmissions (figures 3 and 4).
2 Conservation at equilibrium: round-trip timing
Once data is flowing reliably, problems (2) and (3) should be addressed. Assuming that
the protocol implementation is correct, (2) must represent a failure of sender’s retransmit
timer. A good round trip time estimator, the core of the retransmit timer, is the single most
2 CONSERVATION AT EQUILIBRIUM: ROUND-TRIP TIMING 6
Figure 4:Startup behavior of TCP with Slow-start
• •• ••
• ••••
• ••••
•••• •
•••••
•••••
•••••
•••••
•••••
•• •••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•• •••
• ••••
•••••
•••••
•••••
•••••
••• • •
•••••
•••• •
•••••
•••••
•••••
•••••
•••••
•••••
• ••••
•••••
•••••
•••••
•••••
•••••
••• ••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
•••••
• ••• •
•••••
• ••••
•••••
•••••
•••••
•••
Send Time (sec)
P
a
ck
e
t
S
e
q
u
e
n
ce
N
u
m
b
e
r
(K
B
)
0 2 4 6 8 10
0
2
0
4
0
6
0
8
0
1
0
0
1
2
0
1
4
0
1
6
0
Same conditions as the previous figure (same time of day, same Suns, same network path,
same buffer and window sizes), except the machines were running the 4.3+TCP with slow-
start. No bandwidth is wasted on retransmits but two seconds is spent on the slow-start
so the effective bandwidth of this part of the trace is 16 KBps — two times better than
figure 3. (This is slightly misleading: Unlike the previous figure, the slope of the trace is
20 KBps and the effect of the 2 second offset decreases as the trace lengthens. E.g., if this
trace had run a minute, the effective bandwidth would have been 19 KBps. The effective
bandwidth without slow-start stays at 7 KBps no matter how long the trace.)
important feature of any protocol implementation that expects to survive heavy load. And
it is frequently botched ([26] and [13] describe typical problems).
One mistake is not estimating the variation,σR, of the round trip time,R. From queuing
theory we know thatR and the variation inR increase quickly with load. If the load isρ
(the ratio of average arrival rate to average departure rate),R andσR scale like(1−ρ)−1.
To make this concrete, if the network is running at 75% of capacity, as the Arpanet was in
last April’s collapse, one should expect round-trip-time to vary by a factor of sixteen (−2σ
to +2σ).
TheTCP protocol specification[2] suggests estimating mean round trip time via the low-
pass filter
R← αR+ (1−α)M
whereR is the averageRTT estimate,M is a round trip time measurement from the most
recently acked data packet, andα is a filter gain constant with a suggested value of 0.9.
Once theR estimate is updated, the retransmit timeout interval,rto, for the next packet sent
is set toβR.
2 CONSERVATION AT EQUILIBRIUM: ROUND-TRIP TIMING 7
Figure 5:Performance of an RFC793 retransmit timer
•
• • • •
•
•
• •
• •
• •
•
•
• •
•
•
•
•
• • • • •
•
•
•
•
•
• • • •
• •
•
•
•
• • • • • •
•
•
• •
•
• •
•
•
• •
•
•
•
•
•
•
• •
•
•
• • •
•
• •
•
•
•
• •
•
• •
•
• •
•
•
•
•
• • •
• •
•
•
•
•
•
•
• •
Packet
R
T
T
(
se
c.
)
0 10 20 30 40 50 60 70 80 90 100 110
0
2
4
6
8
1
0
1
2
Trace data showing per-packet round trip time on a well-behaved Arpanet connection. The
x-axis is the packet number (packets were numbered sequentially, starting with one) and
the y-axis is the elapsed time from the send of the packet to the sender’s receipt of its ack.
During this portion of the trace, no packets were dropped or retransmitted.
The packets are indicated by a dot. A dashed line connects them to make the sequence eas-
ier to follow. The solid line shows the behavior of a retransmit timer computed according
to the rules of RFC793.
The parameterβ accounts forRTT variation (see [5], section 5). The suggestedβ = 2
can adapt to loads of at most 30%. Above this point, a connection will respond to load
increases by retransmitting packets that have only been delayed in transit. This forces the
network to do useless work, wasting bandwidth on duplicates of packets that will eventually
be delivered, at a time when it’s known to be having trouble with useful work. I.e., this is
the network equivalent of pouring gasoline on a fire.
We developed a cheap method for estimating variation (see appendix A)3 and the re-
sulting retransmit timer essentially eliminates spurious retransmissions. A pleasant side
effect of estimatingβ rather than using a fixed value is that low load as well as high load
performance improves, particularly over high delay paths such as satellite links (figures 5
and 6).
Another timer mistake is in the backoff after a retransmit: If a packet has to be retrans-
mitted more than once, how should the retransmits be spaced? For a transport endpoint
embedded in a network of unknown topology and with an unknown, unknowable and con-
stantly changing population of competing conversations, only one scheme has any hope
of working—exponential backoff—but a proof of this is beyond the scope of this paper.4
3We are far from the first to recognize that transport needs to estimate both mean and variation. See, for
example, [6]. But we do think our estimator is simpler than most.
4See [8]. Several authors have shown that backoffs ‘slower’ than exponential are stable given finite popula-
tions and knowledge of the global traffic. However, [17] shows that nothing slower than exponential behavior
will work in the general case. To feed your intuition, consider that an IP gateway has essentially the same
behavior as the ‘ether’ in an ALOHA net or Ethernet. Justifying exponential retransmit backoff is the same as
3 ADAPTING TO THE PATH: CONGESTION AVOIDANCE 8
Figure 6:Performance of a Mean+Variance retransmit timer
•
• • • •
•
•
• •
• •
• •
•
•
• •
•
•
•
•
• • • • •
•
•
•
•
•
• • • •
• •
•
•
•
• • • • • •
•
•
• •
•
• •
•
•
• •
•
•
•
•
•
•
• •
•
•
• • •
•
• •
•
•
•
• •
•
• •
•
• •
•
•
•
•
• • •
• •
•
•
•
•
•
•
• •
Packet
R
T
T
(
se
c.
)
0 10 20 30 40 50 60 70 80 90 100 110
0
2
4
6
8
1
0
1
2
Same data as above but the solid line shows a retransmit timer computed according to the
algorithm in appendix A.
To finesse a proof, note that a network is, to a very good approximation, a linear system.
That is, it is composed of elements that behave like linear operators — integrators, delays,
gain stages, etc. Linear system theory says that if a system is stable, the stability is expo-
nential. This suggests that an unstable system (a network subject to random load shocks
and prone to congestive collapse5) can be stabilized by adding some exponential damping
(exponential timer backoff) to its primary excitation (senders, traffic sources).
3 Adapting to the path: congestion avoidance
If the timers are in good shape, it is possible to state with some confidence that a timeout in-
dicates a lost packet and not a broken timer. At this point, something can be done about (3).
Packets get lost for two reasons: they are damaged in transit, or the network is congested
and somewhere on the path there was insufficient buffer capacity. On most network paths,
loss due to damage is rare (�1%) so it is probable that a packet loss is due to congestion in
the network.6
showing that no collision backoff slower than an exponential will guarantee stability on an Ethernet. Unfortu-
nately, with an infinite user population even exponential backoff won’t guarantee stability (although it ‘almost’
does—see [1]). Fortunately, we don’t (yet) have to deal with an infinite user population.
5The phrasecongestion collapse(describing a positive feedback instability due to poor retransmit timers) is
again the coinage of John Nagle, this time from [23].
6Because a packet loss empties the window, the throughput of any window flow control protocol is quite
sensitive to damage loss. For an RFC793 standard TCP running with windoww (where w is at most the
bandwidth-delay product), a loss probability ofp degrades throughput by a factor of(1+ 2pw)−1. E.g., a 1%
damage loss rate on an Arpanet path (8 packet window) degradesTCP throughput by 14%.
The congestion control scheme we propose is insensitive to damage loss until the loss rate is on the order of
the window equilibration length (the number of packets it takes the window to regain its original size after a
loss). If the pre-loss size isw, equilibration takes roughlyw2/3 packets so, for the Arpanet, the loss sensitivity
3 ADAPTING TO THE PATH: CONGESTION AVOIDANCE 9
A ‘congestion avoidance’ strategy, such as the one proposed in [15], will have two
components: The network must be able to signal the transport endpoints that congestion
is occurring (or about to occur). And the endpoints must have a policy that decreases
utilization if this signal is received and increases utilization if the signal isn’t received.
If packet loss is (almost) always due to congestion and if a timeout is (almost) always
due to a lost packet, we have a good candidate for the ‘network is congested’ signal. Partic-
ularly since this signal is delivered automatically by all existing networks, without special
modification (as opposed to [15] which requires a new bit in the packet headers and a mod-
ification toall existing gateways to set this bit).
The other part of a congestion avoidance strategy, the endnode action, is almost identical
in the DEC/ISO scheme and ourTCP7 and follows directly from a first-order time-series
model of the network:8 Say network load is measured by average queue length over fixed
intervals of some appropriate length (something near the round trip time). IfLi is the load at
interval i, an uncongested network can be modeled by sayingLi changes slowly compared
to the sampling time. I.e.,
Li = N
(N constant). If the network is subject to congestion, this zeroth order model breaks down.
The average queue length becomes the sum of two terms, theN above that accounts for the
average arrival rate of new traffic and intrinsic delay, and a new term that accounts for the
fraction of traffic left over from the last time interval and the effect of this left-over traffic
(e.g., induced retransmits):
Li = N + γLi−1
(These are the first two terms in a Taylor series expansion ofL(t). There is reason to believe
one might eventually need a three term, second order model, but not until the Internet has
grown substantially.)
When the network is congested,γ must be large and the queue lengths will start in-
creasing exponentially.9 The system will stabilize only if the traffic sources throttle back at
least as quickly as the queues are growing. Since a source controls load in a window-based
protocol by adjusting the size of the window,W, we end up with the sender policy
On congestion:
Wi = dWi−1 (d < 1)
I.e., a multiplicative decrease of the window size (which becomes an exponential decrease
over time if the congestion persists).
threshold is about 5%. At this high loss rate, the empty window effect described above has already degraded
throughput by 44% and the additional degradation from the congestion avoidance window shrinking is the least
of one’s problems.
We are concerned that the congestion control noise sensitivity is quadratic inw but it will take at least another
generation of network evolution to reach window sizes where this will be significant. If experience shows this
sensitivity to be a liability, a trivial modification to the algorithm makes it …
LDR 3302-21.01.01-1A24-S1, Organizational Theory and Behavior Unit III Essay Top of Form Bottom of Form…
Chapter 9 What are teratogens? Give 5 examples. Define each of these stages: Germinal, embryonic,…
You are a Financial Analyst that has been appointed to lead a team in the…
You are familiar with the ANA Code of Ethics and have a growing understanding of…
This week’s discussion will focus on management decision-making and control in two companies, American corporation…
Mary Rowlandson felt that the man who eventually came to own her, Quinnapin, was “the…