donderdag 21 augustus 2008

Poisson distribution

In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event. The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.
The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. A classic example is the nuclear decay of atoms.

Occurrence

The Poisson distribution arises in connection with Poisson processes. It applies to various phenomena of discrete nature (that is, those that may happen 0, 1, 2, 3, ... times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples of events that may be modelled as a Poisson distribution include:

* The number of soldiers killed by horse-kicks each year in each corps in the Prussian cavalry. This example was made famous by a book by Ladislaus Josephovich Bortkiewicz (1868–1931).
* The number of phone calls at a call center per minute.
* The number of times a web server is accessed per minute.
* The number of mutations in a given stretch of DNA after a certain amount of radiation.

The "law of small numbers"

The word law is sometimes used as a synonym of probability distribution, and convergence in law means convergence in distribution. Accordingly, the Poisson distribution is sometimes called the law of small numbers because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen.


The work focused on certain random variables N that count, among other things, the number of discrete occurrences (sometimes called "arrivals") that take place during a time interval of given length. If the expected number of occurrences in this interval is λ, then the probability that there are exactly k occurrences (k being a non-negative integer, k = 0, 1, 2, ...) is equal to


f(k; λ) = (λ^k · e^(−λ)) / k!



where

* e is the base of the natural logarithm
* k is the number of occurrences of an event - the probability of which is given by the function
* k! is the factorial of k
* λ is a positive real number, equal to the expected number of occurrences during the given interval. For instance, if the events occur on average 4 times per minute, and you are interested in the number of events occurring in a 10 minute interval, you would use as your model a Poisson distribution with λ = 10*4 = 40 (evaluated in the sketch after this list).
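
As an illustration, here is a minimal Python sketch of the probability mass function above, evaluated for the λ = 40 example; the function and variable names are my own.

    from math import exp, factorial

    def poisson_pmf(k, lam):
        # P(X = k) = lam^k * e^(-lam) / k!
        return lam**k * exp(-lam) / factorial(k)

    # The example above: 4 events per minute over a 10 minute interval, so lam = 40.
    print(poisson_pmf(40, 4 * 10))  # probability of observing exactly 40 events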

M/M/1 model

The M/M/1 is a single-server queue model that can be used to approximate many simple systems.
[Figure: M/M/1 schema]

Following Kendall's notation it indicates a system where:

* Arrivals are a Poisson process;
* Service time is exponentially distributed;
* There is one server.

Analysis

If the arrival rate is λ and the service rate is μ, we can represent the system by a birth-death process, where each state is the number of users. Of course we have an infinite number of states: state 0 (no users in the system), state 1 (1 user), state 2 (two users), and so on. The birth rate is always λ and the death rate is always μ. In fact, regardless of the state, only two events can occur:

* A new user arrives. So if the system is in state k, it goes to state k + 1 with rate λ
* A user leaves the system. So if the system is in state k, it goes to state k − 1 (or k if k is 0) with rate μ

It is now easy to see that the system is stable only if λ < μ. If the service rate is less than the arrival rate, the number of users in the queue grows without bound.

If the system is stable, we can obtain many performance measures of interest, like the following (computed in the sketch after this list):

* The expected number of users in the system
* The expected number of users in the queue
* The throughput
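
For a stable M/M/1 queue these measures have simple closed forms: with utilization ρ = λ/μ, the expected number in the system is ρ/(1 − ρ) and the expected number in the queue is ρ²/(1 − ρ). A minimal Python sketch, with example rates of my own choosing:

    def mm1_metrics(lam, mu):
        # Stable only when lam < mu (see the stability argument above).
        assert lam < mu, "unstable queue: it grows without bound"
        rho = lam / mu                  # server utilization
        n_system = rho / (1 - rho)      # expected number of users in the system
        n_queue = rho ** 2 / (1 - rho)  # expected number waiting in the queue
        throughput = lam                # in steady state every arrival is served
        return rho, n_system, n_queue, throughput

    print(mm1_metrics(lam=3.0, mu=5.0))  # rho = 0.6, N = 1.5, Nq = 0.9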

Example

There are many situations where an M/M/1 model could be used. For instance, consider a post office with only one employee, and therefore one queue. The customers arrive, go into the queue, are served, and leave the system. If the arrival process is Poisson and the service time is exponential, we can use an M/M/1 model. Hence, we can easily calculate the expected number of people in the queue, the probabilities that they will have to wait for a particular length of time, and so on.

woensdag 20 augustus 2008

Jackson network

The concept of Jackson networks, named after James R. Jackson, is the first significant development in the theory of networks of queues, in which each node of the queueing network can be analyzed separately.

Definition

A network of m interconnected queues is known as a Jackson network if it meets the following conditions:

1. Customers arriving from outside the system arrive as a Poisson process.
2. The servers each act as a Poisson process (exponentially distributed service times).
3. A customer leaving queue i will either move to some new queue j with probability Pij or leave the system with probability 1 − Σ_{j=1}^{m} Pij. Those events are independent and identically distributed.
4. The utilization of all of the queues is less than one.

In such a network, Jackson's theorem applies and the distribution of customers in each queue when the system is in equilibrium is exactly the distribution of an M/M/1 queueing model with the same utilization.
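
As a sketch of how this is used in practice, the following Python fragment solves the traffic equations λ = γ + Pᵀλ for a small hypothetical three-queue network (the rates and routing matrix below are invented for illustration) and then applies the M/M/1 formula to each node, as Jackson's theorem allows:

    import numpy as np

    # gamma[i]: external Poisson arrival rate into queue i
    # P[i][j]:  routing probability from queue i to queue j
    # mu[i]:    service rate of queue i
    gamma = np.array([2.0, 1.0, 0.0])
    P = np.array([[0.0, 0.5, 0.3],
                  [0.0, 0.0, 0.6],
                  [0.2, 0.0, 0.0]])
    mu = np.array([6.0, 5.0, 4.0])

    # Traffic equations: lambda = gamma + P^T lambda  =>  (I - P^T) lambda = gamma
    lam = np.linalg.solve(np.eye(3) - P.T, gamma)
    rho = lam / mu
    assert (rho < 1).all(), "Jackson's theorem requires every utilization below one"

    # Each queue then behaves like an independent M/M/1 queue:
    print(rho / (1 - rho))  # expected number of customers at each node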

dinsdag 19 augustus 2008

Scheduling algorithm

In computer science, a scheduling algorithm is the method by which threads, processes or data flows are given access to system resources (e.g. processor time, communications bandwidth). This is usually done to load-balance a system effectively or to achieve a target quality of service. The need for a scheduling algorithm arises from the requirement for most modern systems to perform multitasking (execute more than one process at a time) and multiplexing (transmit multiple flows simultaneously).

In computing and multitasking

The algorithm used may be as simple as round-robin in which each process is given equal time (for instance 1 ms, usually between 1 ms and 100 ms) in a cycling list. So, process A executes for 1 ms, then process B, then process C, then back to process A.
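
A minimal sketch of this cycling behaviour in Python; the process names and burst times are illustrative:

    from collections import deque

    def round_robin(bursts, quantum_ms=1):
        # bursts: mapping of process name -> required CPU time in ms
        queue = deque(bursts.items())
        timeline = []
        while queue:
            name, remaining = queue.popleft()
            ran = min(quantum_ms, remaining)
            timeline.append((name, ran))
            if remaining > ran:
                queue.append((name, remaining - ran))  # back to the end of the list
        return timeline

    # A executes for 1 ms, then B, then C, then back to A, as in the text.
    print(round_robin({"A": 3, "B": 2, "C": 1}))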

More advanced algorithms take into account process priority, or the importance of the process. This allows some processes to use more time than others. Note that the kernel always uses whatever resources it needs to ensure proper functioning of the system, and so can be said to have infinite priority. In SMP systems, processor affinity is considered to increase overall system performance by reducing cache thrashing, even if it may cause an individual process to run more slowly.

In computer networks and multiplexing

In packet-switched computer networks and other statistical multiplexing, the notion of a scheduling algorithm is used as an alternative to first-come first-served queuing of data packets. In advanced packet radio wireless networks such as the HSDPA 3.5G cellular system, channel-dependent scheduling may be used to take advantage of favourable channel conditions to increase the throughput and system spectral efficiency. The simplest best-effort scheduling algorithms are round-robin, fair queuing (a max-min fair scheduling algorithm), proportionally fair scheduling and maximum throughput. If differentiated or guaranteed quality of service is offered, as opposed to best-effort communication, weighted fair queuing may be utilized.

In computer I/O

The I/O scheduler determines the order in which disk I/O requests are submitted to the disk device.

maandag 18 augustus 2008

TCP Vegas revisited

In 1994, Brakmo, O'Malley & Peterson presented a new TCP implementation called Vegas that achieves between 40% and 70% better throughput and one-fifth to one-half the losses, when compared with TCP Reno. Vegas is an alternative implementation that interoperates with any other valid implementation of TCP, and all changes are confined to the sending side. This is a summary of their work.
Vegas - New retransmission mechanism
  1. In Reno, RTT and mean variance estimates are computed using a coarse-grained timer (around 500 ms), meaning that the RTT estimate is not very accurate. This granularity also influences how often TCP checks whether it should time out on a segment. Tests found that for losses that resulted in a timeout, it took Reno an average of 1100 ms from the time it sent a segment that was lost until it timed out and resent the segment, whereas less than 300 ms would have been the correct timeout interval had a more accurate clock been used. Vegas fixes this problem by using a finer-grained timer.
     
  2. Vegas also extends the retransmission mechanism as follows: the system clock is read and recorded each time a segment is sent; when an ACK arrives, the clock is read again and the RTT calculation is done using this time and the timestamp recorded for the relevant segment. Using this more accurate RTT, retransmission is decided as follows:
    a.- When a dupack is received, Vegas checks to see if the new RTT (current time - timestamp recorded) is greater than the RTO. If it is, Vegas retransmits the segment without having to wait for the third dupack (see the sketch after this list). In many cases the third dupack is never received, and therefore Reno would have to rely on the coarse-grained timeout to catch the loss.

    b.- When a non-dupack is received, if it is the first or second one after a retransmission, Vegas checks again to see if RTT > RTO; if so, the segment is retransmitted. This will catch any other segment that may have been lost prior to the retransmission, without having to wait for a dupack. In other words, Vegas treats the receipt of certain ACKs as a trigger to check whether a timeout should happen, but it still contains Reno's coarse-grained timeout code in case these mechanisms fail to recognize a lost segment.
     

  3. Finally, in Reno it is possible to decrease the congestion window more than once for losses occurring during one RTT interval. In contrast, Vegas only decreases the congestion window if a retransmitted segment was sent after the last decrease. Any losses that happened before the last decrease are ignored and therefore do not imply that the window should be decreased again. This change is needed because Vegas detects losses much sooner than Reno.
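
A rough Python sketch of the dupack check in item 2a, assuming a per-segment table of send timestamps; all names here are illustrative, not the Vegas source code:

    import time

    send_times = {}  # sequence number -> time the segment was sent

    def on_segment_sent(seq):
        send_times[seq] = time.monotonic()

    def on_dupack(seq, rto):
        # If the accurately measured RTT already exceeds the RTO, retransmit
        # on the first dupack instead of waiting for the third one.
        rtt = time.monotonic() - send_times[seq]
        return "retransmit" if rtt > rto else "wait"
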
Vegas - New congestion avoidance mechanism
Reno's congestion detection uses the loss of segments as a signal of congestion. It has no mechanism to detect the incipient stages of congestion - before losses occur - so that they can be prevented. Reno is reactive, rather than proactive, in this respect: it needs to create losses to find the available bandwidth of the connection.
There are several proposed approaches for proactive congestion detection, for example:
  1. Wang & Crowcroft's DUAL algorithm is based on the increase of the RTT; every two round-trip delays the algorithm checks to see if the current RTT is greater than the average of the minimum and maximum RTTs seen so far; if so, the congestion window is decreased by one-eighth.
     
  2. Jain's CARD adjusts the congestion window every two round-trip delays based on the product (current cwnd - old cwnd) * (current RTT - old RTT); if the result is positive, decrease the window by one-eighth; if negative or zero, increase the window by one MSS. This way the window will oscillate around its optimal point.
     
  3. Wang & Crowcroft's Tri-S scheme increases the window size by one segment every RTT and then compares the throughput achieved to the throughput when the window was one segment smaller; if the difference is less than one-half the throughput achieved when only one segment was in transit - as was the case at the beginning of the connection - the window is decreased by one segment. Tri-S calculates the throughput by dividing the number of bytes outstanding in the network by the RTT.
Vegas's implementation uses the idea of measuring and controlling the amount of extra data that a connection has in transit, where extra data means data that would not have been sent if the bandwidth used by the connection exactly matched the available bandwidth of the link. Vegas's goal is to maintain the "right" amount of extra data in the network. Obviously, if a connection is sending too much extra data, it will cause congestion; if it is sending too little extra data, it cannot respond rapidly enough to transient increases in the available bandwidth.
This algorithm is not in effect during slow-start and its implementation is as follows:
  1. Define a given connection's BaseRTT to be the RTT of a segment when the connection is not congested; in practice Vegas sets BaseRTT to the minimum of all measured RTTs; it is commonly the RTT of the first segment sent by the connection, before the router queues increase due to traffic generated by this connection. If we assume we are not overflowing the connection, the expected throughput is given by: Expected = WindowSize / BaseRTT, where WindowSize is the size of the current congestion window, which we assume to be equal to the number of bytes outstanding.
     
  2. Calculate the current Actual sending rate by recording how many bytes are transmitted between the time a segment is sent and the time its ACK is received, together with the corresponding sample RTT, and dividing the number of bytes transmitted by that sample RTT. This calculation is done once per round-trip time.
     
  3. Then compare Actual to Expected and adjust the window accordingly. Let Diff = Expected - Actual. Note that Diff is positive or zero by definition, since Actual > Expected implies that we have to change BaseRTT to the latest sample RTT. Also define two thresholds α and β, such that α < β, roughly corresponding to having too little and too much extra data in the network, respectively. When Diff < α, Vegas increases the congestion window linearly during the next RTT, and when Diff > β, Vegas decreases the congestion window linearly during the next RTT. The congestion window is left unchanged when α < Diff < β.
Intuitively, the farther the actual throughput gets from the expected throughput, the more congestion there is in the network, which implies that the sending rate should be reduced; the β threshold triggers this decrease. On the other hand, when the actual throughput gets too close to the expected throughput, the connection is in danger of not utilizing the available bandwidth; the α threshold triggers this increase. The overall goal is to keep between α and β extra bytes in the network.
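
Putting the pieces together, here is a simplified Python sketch of the once-per-RTT Vegas adjustment described above; this is a paraphrase of the published algorithm, not the authors' code, and it approximates Actual by cwnd / sampleRTT:

    def vegas_update(cwnd, base_rtt, sample_rtt, alpha, beta, mss):
        expected = cwnd / base_rtt    # rate if no queuing were happening
        actual = cwnd / sample_rtt    # measured rate over the last RTT
        diff = expected - actual      # proportional to extra data in the network
        if diff < alpha:
            cwnd += mss               # too little extra data: grow linearly
        elif diff > beta:
            cwnd -= mss               # too much extra data: shrink linearly
        return cwnd                   # within [alpha, beta]: leave unchanged
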
Vegas - Modified slow-start mechanism
Reno's slow-start mechanism is very expensive in terms of losses; since it doubles the size of the congestion window every RTT while there are no losses - which is equivalent to doubling the attempted throughput every RTT - when it finally overruns the connection bandwidth, we can expect losses on the order of half the current congestion window, more if we encounter a burst from another connection. Vegas tries to find a connection's available bandwidth without incurring this kind of loss. Toward this end, the congestion detection mechanism is incorporated into slow-start with only minor modifications.
To be able to detect and avoid congestion during slow-start, Vegas allows exponential growth only every other RTT. In between, the congestion window stays fixed so that a valid comparison of the expected and actual rates can be made. When the actual rate falls below the expected rate by a certain amount - the γ threshold - Vegas changes from slow-start mode to linear increase/decrease mode.
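
A sketch of that modified slow-start decision, again as a paraphrase rather than the authors' code; grow_this_rtt stands for the alternating-RTT toggle:

    def vegas_slow_start(cwnd, base_rtt, sample_rtt, gamma, grow_this_rtt):
        # Compare expected and actual rates while the window is held fixed.
        diff = cwnd / base_rtt - cwnd / sample_rtt
        if diff > gamma:
            return cwnd, "linear"    # leave slow-start for increase/decrease mode
        if grow_this_rtt:
            cwnd *= 2                # exponential growth, every other RTT only
        return cwnd, "slow-start"
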
Discussion about Vegas TCP
Simulations show that if there are enough buffers in the routers - meaning that Vegas' congestion avoidance mechanisms can function effectively - a higher throughput and a faster response time result. As the load increases and/or the number of router buffers decreases, Vegas' congestion avoidance mechanisms are not as effective, and Vegas starts to behave more like Reno. Under heavy congestion, Vegas behaves very similarly to Reno, since Vegas falls back to Reno's coarse-grained timeout mechanism.
The important point to keep in mind is that, up to the point where congestion becomes bad enough for Vegas' behavior to degenerate into Reno's, Vegas is less aggressive in its use of router buffers than Reno. This is because Vegas limits its use of router buffers as specified by the β threshold, whereas Reno increases its window size until there are losses - which means all the router buffers are being used.
Finally, Vegas' congestion detection algorithm depends on an accurate value for BaseRTT. If the estimate of BaseRTT is too small, the protocol's throughput will stay below the available bandwidth; if it is too large, it will overrun the connection. The authors report that, in their experience, the protocol does well with its current choice of BaseRTT, but they plan to study this more carefully in the future.

woensdag 6 augustus 2008

Dynamic Host Configuration Protocol


Dynamic Host Configuration Protocol (DHCP) is a protocol used by networked devices (clients) to obtain the parameters necessary for operation in an Internet Protocol network. This protocol reduces system administration workload, allowing devices to be added to the network with little or no manual configuration.

Applicability


Dynamic Host Configuration Protocol is a way to manage network parameter assignment from a single DHCP server, or a group of DHCP servers arranged in a fault-tolerant manner. Even in small networks, Dynamic Host Configuration Protocol is useful because it can make it easy to add new machines to the local network.

DHCP is also recommended even in the case of servers whose addresses rarely change, so that if a server needs to be readdressed (RFC 2071), changes can be made in as few places as possible. For devices such as routers and firewalls that should not use DHCP, it can be useful to put Trivial File Transfer Protocol (TFTP) or SSH servers on the same machine that runs DHCP, which also serves to centralize administration.

DHCP can be used to assign addresses directly to servers and desktop machines, and, through a Point-to-Point Protocol (PPP) proxy, to dialup and broadband on-demand hosts, as well as for residential Network address translation (NAT) gateways and routers. DHCP is generally not appropriate for infrastructure such as non-edge routers and DNS servers.


History
DHCP emerged as a standard protocol in October 1993, as defined in RFC 1531, succeeding the BOOTP protocol. The next update, RFC 2131, released in 1997, is the current DHCP definition. The latest proposed standard for DHCP over IPv6 (DHCPv6) can be found in RFC 3315.


Basic protocol operation
The Dynamic Host Configuration Protocol (DHCP) automates the assignment of IP addresses, subnet masks, default gateway, and other IP parameters. [1]

When a DHCP-configured client (be it a computer or any other network-aware device) connects to a network, the DHCP client sends a broadcast query requesting necessary information from a DHCP server. The DHCP server manages a pool of IP addresses and information about client configuration parameters such as the default gateway, the domain name, the DNS servers, other servers such as time servers, and so forth. Upon receipt of a valid request the server will assign the computer an IP address, a lease (the length of time for which the allocation is valid), and other IP configuration parameters, such as the subnet mask and the default gateway. The query is typically initiated immediately after booting and must be completed before the client can initiate IP-based communication with other hosts.

DHCP provides three modes for allocating IP addresses. The best-known mode is dynamic, in which the client is provided a "lease" on an IP address for a period of time. Depending on the stability of the network, this could range from hours (a wireless network at an airport) to months (for desktops in a wired lab). At any time before the lease expires, the DHCP client can request renewal of the lease on the current IP address. A properly functioning client will use the renewal mechanism to maintain the same IP address throughout its connection to a single network; otherwise it risks losing its lease while still connected, disrupting network connectivity while it renegotiates with the server for its original or a new IP address.

The two other modes for allocation of IP addresses are automatic (also known as DHCP Reservation), in which the address is permanently assigned to a client, and manual, in which the address is selected by the client (manually by the user or any other means) and the DHCP protocol messages are used to inform the server that the address has been allocated.

The automatic and manual methods are generally used when finer-grained control over IP addresses is required (typical of tight firewall setups), although typically a firewall will allow access to the range of IP addresses that can be dynamically allocated by the DHCP server.

The four-phase process of address allocation is commonly known by the mnemonic DORA: D-Discovery, O-Offer, R-Request, A-Acknowledgement.


Security
Having been standardized before network security became a significant issue, the basic DHCP protocol includes no security features, and is potentially vulnerable to two types of attacks:[2]

Unauthorized DHCP Servers: because a client cannot specify the server it wants, an unauthorized server can respond to client requests, sending client network configuration values that are beneficial to the attacker. As an example, a hacker can hijack the DHCP process to configure clients to use a malicious DNS server or router (see also DNS cache poisoning).
Unauthorized DHCP Clients: By masquerading as a legitimate client, an unauthorized client can gain access to network configuration and an IP address on a network it should otherwise not be allowed to use. Also, by flooding the DHCP server with requests for IP addresses, it is possible for an attacker to exhaust the pool of available IP addresses, disrupting normal network activity (a denial of service attack).
To combat these threats RFC 3118 ("Authentication for DHCP Messages") introduced authentication information into DHCP messages allowing clients and servers to reject information from invalid sources. Although support for this protocol is widespread, a large number of clients and servers still do not fully support authentication, thus forcing servers to support clients that do not support this feature. As a result, other security measures are usually implemented around the DHCP server (such as IPsec) to ensure that only authenticated clients and servers are granted access to the network.

Wherever possible, DHCP-assigned addresses should be dynamically linked to a secure DNS server, to allow troubleshooting by name rather than by a potentially unknown address. Effective DHCP-DNS linkage requires a file of either MAC addresses or local names, sent to DNS, that uniquely identifies physical hosts. The DHCP server ensures that all IP addresses are unique, i.e., no IP address is assigned to a second client while the first client's assignment is valid (its lease has not expired). Thus IP address pool management is done by the server and not by a network administrator.


IP address allocation
Depending on implementation, the DHCP server may have three methods of allocating IP addresses, plus a fourth mode of operation ("manual") in which the client (rather than the DHCP server) assigns an IP address. (Warning: the terminology below differs from the terminology above in Basic protocol operation.)

dynamic allocation
A network administrator assigns a range of IP addresses to DHCP, and each client computer on the LAN has its IP software configured to request an IP address from the DHCP server during network initialization. The request-and-grant process uses a lease concept with a controllable time period, allowing the DHCP server to reclaim (and then reallocate) IP addresses that are not renewed (dynamic re-use of IP addresses).
automatic allocation
The DHCP server permanently assigns a free IP address to a requesting client from the range defined by the administrator. This is like dynamic allocation, but the DHCP server keeps a table of past IP address assignments, so that it can preferentially assign to a client the same IP address that the client previously had.
static allocation
The DHCP server allocates an IP address based on a table with MAC address/IP address pairs, which are manually filled in (perhaps by a network administrator). Only requesting clients with a MAC address listed in this table will be allocated an IP address. This feature (which is not supported by all routers) is variously called "Static DHCP Assignment" (by DD-WRT), "fixed-address" (by the dhcpd documentation), "DHCP reservation" or "Static DHCP" (by Cisco/Linksys), and "IP reservation" or "MAC/IP binding" (by various other router manufacturers).
manual allocation
The DHCP server does not assign the IP address; instead, the client is configured with a user-specified static IP address.
Many DHCP servers can manage hosts by more than one of the above methods. For example, the known hosts on the network can be assigned an IP address based on their MAC address (static allocation) whereas "guest" computers (such as laptops via WiFi) are allocated a temporary IP address out of a pool compatible with the network to which they're attached (dynamic allocation).


DHCP and firewalls

Firewalls usually have to permit DHCP traffic explicitly. The specification of the DHCP client-server protocol describes several cases when packets must have the source address of 0x00000000 or the destination address of 0xffffffff. Anti-spoofing policy rules and tight inclusive firewalls often stop such packets. Multi-homed DHCP servers require special consideration and further complicate configuration.

To allow DHCP, network administrators need to allow several types of packets through the server-side firewall. All DHCP packets travel as UDP datagrams; all client-sent packets have source port 68 and destination port 67; all server-sent packets have source port 67 and destination port 68. For example, a server-side firewall should allow the following types of packets:

* Incoming packets from 0.0.0.0 or dhcp-pool to dhcp-ip
* Incoming packets from any address to 255.255.255.255
* Outgoing packets from dhcp-ip to dhcp-pool or 255.255.255.255

where dhcp-ip represents any address configured on a DHCP server host and dhcp-pool stands for the pool from which a DHCP server assigns addresses to clients.
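
To make the three rules concrete, here is a hedged Python sketch of a packet check implementing exactly the list above, with dhcp_ip and dhcp_pool as defined in the text (dhcp_pool as a set of address strings; the function name is my own):

    def dhcp_packet_allowed(direction, src_ip, dst_ip, src_port, dst_port,
                            dhcp_ip, dhcp_pool):
        # All DHCP traffic is UDP: client->server uses 68->67, server->client 67->68.
        if (src_port, dst_port) not in {(68, 67), (67, 68)}:
            return False
        if direction == "in":
            return ((src_ip == "0.0.0.0" or src_ip in dhcp_pool) and dst_ip == dhcp_ip) \
                or dst_ip == "255.255.255.255"
        if direction == "out":
            return src_ip == dhcp_ip and (dst_ip in dhcp_pool or dst_ip == "255.255.255.255")
        return False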


Technical details


[Figure: schema of a typical DHCP session]

DHCP uses the same two IANA-assigned ports as BOOTP: 67/udp for the server side, and 68/udp for the client side.

DHCP operations fall into four basic phases. These phases are IP discovery, IP lease offer, IP request, and IP lease acknowledgement.

After the client obtains an IP address, it may send an Address Resolution Protocol (ARP) query to prevent IP conflicts caused by overlapping address pools of DHCP servers.


DHCP discovery
The client broadcasts on the physical subnet to find available servers. Network administrators can configure a local router to forward DHCP packets to a DHCP server on a different subnet. This client implementation creates a UDP packet with the broadcast destination of 255.255.255.255 or the subnet broadcast address.

A client can also request its last-known IP address (in the example below, 192.168.1.100). If the client is still in a network where this IP is valid, the server might grant the request. Otherwise, it depends whether the server is set up as authoritative or not. An authoritative server will deny the request, making the client ask for a new IP immediately. A non-authoritative server simply ignores the request, leading to an implementation-dependent timeout for the client to give up on the request and ask for a new IP address.
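
A small Python sketch of that server-side decision, assuming "still valid" simply means the requested address lies in the server's subnet (the function name and example values are illustrative):

    import ipaddress

    def answer_requested_ip(requested_ip, subnet, authoritative):
        if ipaddress.ip_address(requested_ip) in ipaddress.ip_network(subnet):
            return "grant"        # the last-known address is still valid here
        if authoritative:
            return "DHCPNAK"      # deny, so the client asks for a new IP at once
        return None               # ignore; the client times out and starts over

    print(answer_requested_ip("192.168.1.100", "192.168.1.0/24", True))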


DHCP offers
When a DHCP server receives an IP lease request from a client, it extends an IP lease offer. This is done by reserving an IP address for the client and sending a DHCPOFFER message across the network to the client. This message contains the client's MAC address, followed by the IP address that the server is offering, the subnet mask, the lease duration, and the IP address of the DHCP server making the offer.

The server determines the configuration, based on the client's hardware address as specified in the CHADDR field. Here the server, 192.168.1.1, specifies the IP address in the YIADDR field.





DHCP acknowledgement
When the DHCP server receives the DHCPREQUEST message from the client, it initiates the final phase of the configuration process. This acknowledgement phase involves sending a DHCPACK packet to the client. This packet includes the lease duration and any other configuration information that the client might have requested. At this point, the IP configuration process is complete.

The server acknowledges the request and sends the acknowledgement to the client. The system as a whole expects the client to configure its network interface with the supplied options.

DHCP information
The client sends a DHCPINFORM message to the DHCP server, either to request more information than the server sent with the original DHCPACK or to repeat data for a particular application; for example, browsers use DHCP Inform to obtain web proxy settings via WPAD. Such queries do not cause the DHCP server to refresh the IP expiry time in its database.

DHCP releasing
The client sends a request to the DHCP server to release the DHCP lease, and the client unconfigures its IP address. As clients usually do not know when users may unplug them from the network, the protocol does not mandate the sending of DHCP Release.

Client configuration parameters
A DHCP server can provide optional configuration parameters to the client. RFC 2132 describes the available DHCP options defined by Internet Assigned Numbers Authority (IANA) - DHCP and BOOTP PARAMETERS.


Options
Option 60, the Vendor Class Identifier (VCI), identifies the vendor and functionality of a DHCP client. The information is a variable-length string of characters or octets which has a meaning specified by the vendor of the DHCP client. By setting the VCI in its DHCP requests, a client can communicate to the server that it is using a certain type of hardware or firmware, which allows the DHCP server to differentiate between kinds of client machines and process their requests appropriately. Some types of set-top boxes, for example, set the VCI to inform the DHCP server about the hardware type and functionality of the device. The value of this option gives the DHCP server a hint about any extra information that this client needs in a DHCP response.

dinsdag 5 augustus 2008

Network congestion avoidance

Network congestion avoidance is a process used in computer networks to avoid congestion.

The fundamental problem is that all network resources are limited, including router processing time and link throughput. For example:

* Today's (2006) wireless LAN effective bandwidth throughput (15-100 Mbit/s) is easily filled by a single personal computer.
* Even on fast computer networks (e.g. 1 Gbit/s), the backbone can easily be congested by a few servers and client PCs.
* Because P2P scales very well, file transmissions by P2P have no problem filling an uplink or some other network bottleneck, particularly when nearby peers are preferred over distant peers.
* Denial-of-service attacks by botnets are capable of filling even the largest Internet backbone network links (40 Gbit/s as of 2007), generating large-scale network congestion.
Implementations of connection-oriented protocols, such as the widely-used TCP protocol, generally watch for packet errors, losses, or delays (see Quality of Service) in order to adjust the transmit speed. There are many different network congestion avoidance processes, since there are a number of different trade-offs available. [1]

TCP/IP congestion avoidance
Main article: TCP congestion avoidance algorithm
The TCP congestion avoidance algorithm is the primary basis for congestion control in the Internet.

Problems occur when many concurrent TCP flows are experiencing port queue buffer tail-drops. Then TCP's automatic congestion avoidance is not enough. All flows that experience port queue buffer tail-drop will begin a TCP retrain at the same moment - this is called TCP global synchronization.


Purpose
"Recommendations on Queue Management and Congestion Avoidance in the Internet" (RFC 2309[7]) states that:

* Fewer packets will be dropped with Active Queue Management (AQM).
* The link utilization will increase because less TCP global synchronization will occur.
* By keeping the average queue size small, queue management will reduce the delays and jitter seen by flows.
* The connection bandwidth will be more equally shared among connection-oriented flows, even without flow-based RED or WRED.

Random early detection
Main article: Random early detection
Main article: Weighted random early detection
One solution is to use random early detection (RED) on the network equipment's port queue buffers. [8] [9] On network equipment ports with more than one queue buffer, weighted random early detection (WRED) could be used if available.

RED indirectly signals the sender and receiver by deleting some packets, e.g. when the average queue length is more than 50% full (the lower threshold), and deletes linearly more (or, better according to the paper, cubically more) packets [10] as the average approaches an upper threshold of e.g. 100%. The average queue buffer lengths are computed over 1 second at a time.
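
A minimal Python sketch of the linear variant of this drop rule; classic RED tracks the average queue length with an exponentially weighted moving average, which is one plausible reading of the "over 1 second at a time" averaging above (the thresholds and weight are illustrative):

    import random

    def red_avg(avg, sample, weight=0.002):
        # Exponentially weighted moving average of the queue length.
        return (1 - weight) * avg + weight * sample

    def red_drop(avg, min_th, max_th, max_p=0.1):
        # Drop probability rises linearly between the two thresholds.
        if avg < min_th:
            return False
        if avg >= max_th:
            return True
        p = max_p * (avg - min_th) / (max_th - min_th)
        return random.random() < p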


Flowbased-RED/WRED
Some network equipment is equipped with ports that can follow and measure each flow (flow-based RED/WRED) and is thereby able to act on a flow that is using too much bandwidth according to some QoS policy. A policy could divide the bandwidth among all flows by some criteria.


IP ECN
Main article: Explicit Congestion Notification
Another approach is to use IP ECN. [11] ECN is only used when the two hosts signal that they want to use it. With this method, an ECN bit is used to signal that there is explicit congestion. This is better than the indirect packet-delete congestion notification performed by the RED/WRED algorithms, but it requires explicit support by both hosts to be effective. [12] Some outdated or buggy network equipment drops packets with the ECN bit set, rather than ignoring the bit. Sally Floyd[8], one of the authors of ECN, provides more information on the status of ECN, including the version required for Cisco IOS.

When a router receives a packet marked as ECN-capable and anticipates congestion (using RED), it sets an ECN flag notifying the sender of congestion. The sender then ought to decrease its transmission bandwidth, e.g. by decreasing the TCP window size (sending rate) or by other means.
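
Combined with the RED sketch above, the router's decision reduces to something like the following (simplified; a real router applies this probabilistically between the thresholds):

    def ecn_action(red_would_act, ect_capable):
        # If RED decides to act on this packet and the transport negotiated ECN
        # (the ECT bit is set), mark Congestion Experienced instead of dropping.
        if not red_would_act:
            return "forward"
        return "mark-CE" if ect_capable else "drop"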


Cisco AQM: Dynamic buffer limiting (DBL)

Cisco has taken a step further in their Catalyst 4000 series with engines IV and V, which can classify all flows as "aggressive" (bad) or "adaptive" (good). This ensures that no flows fill the port queues for a long time. DBL can utilize IP ECN instead of packet-delete signalling. [13] [14]


TCP Window Shaping
Congestion avoidance can also be achieved efficiently by reducing the amount of traffic flowing into your network. When an application requests a large file, graphic or web page, it usually advertises a "window" of between 32K and 64K. This results in the server sending a full window of data (assuming the file is larger than the window). When many applications simultaneously request downloads, this data creates a congestion point at your upstream provider by flooding the queue much faster than it can be emptied. By using a device to reduce the window advertisement, the remote servers will send less data, thus reducing the congestion and allowing traffic to flow more freely. This technique can reduce congestion in a network by a factor of 40.
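
The arithmetic behind this: a TCP flow cannot send more than one window per round trip, so its throughput is bounded by window size divided by RTT, and clamping the advertised window caps each flow's rate. A quick illustration in Python (the 50 ms RTT is an assumption):

    def max_rate_bps(window_bytes, rtt_seconds):
        # A TCP flow cannot exceed one full window per round trip.
        return window_bytes * 8 / rtt_seconds

    print(max_rate_bps(64 * 1024, 0.05))  # 64K window: ~10.5 Mbit/s
    print(max_rate_bps(16 * 1024, 0.05))  # clamped to 16K: ~2.6 Mbit/s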