THE DESIGN, ANALYSIS AND MANAGEMENT OF HIGH SPEED PACKET SWITCHED NETWORKS

Part I - Summary of the Project

Advanced network technologies, such as SONET, ATM, Fast Ethernet, and the high-speed Internet backbone, have been considered to transport multimedia services. One of challenges of delivering the multimedia content is to meet the quality of service (QoS) requirement of each individual connection. Due to the large bandwidth-delay products involved, the heterogeneous traffic types that need to be supported, and the various QoS requirements imposed on the networks, network control and management becomes more difficult for network designers. For example, real-time traffic requires small end-to-end delay and jitter, while data transfer requires low cell loss rates to reduce the number of retransmissions and prevent the network from being more congested. Thus, it is necessary to control user's traffic so that network resources can be shared effectively and the QoS can be met for all users. In this work we have designed and analyzed various buffer management approaches to provide QOS guarantees. We also report on our work on a scalable multicast switch.

Part II - Technical Information

The first set of tasks were performed by R. Boorstyn and H. J. Chao over the three year duration of the grant. The remaining tasks were performed by T. T. Lee in the first year of the grant.

(1) Design and Implementation of Abacus Switch: A Scalable Multicast ATM Switch (Chao)

We proposed a new architecture for a multicast ATM switch scalable from a few tens to a few thousands of input ports. The switch, called Abacus switch, has a nonblocking switch fabric followed by small switch modules at the output ports. It has buffers at input and output ports. Cell replication, cell routing, output contention resolution, and cell addressing are all performed in a distributed way so that it can be scaled up to thousands of input and output ports. A novel algorithm has been proposed to resolve output port contention while achieving input buffers sharing, fairness among the input ports, and call splitting for multicasting. The channel grouping mechanism is also adopted in the switch to reduce the hardware complexity and improve the switch's throughput, while the cell sequence integrity is preserved. The switch can also handle multiple priority traffic by routing cells according to their priority levels. The performance study of the Abacus switch in throughput, average cell delay, and cell loss rate is presented. A key ASIC chip for building the Abacus switch, called the ARC (ATM Routing and Concentration) chip, contains a two-dimensional array (32x32) of switch elements that are arranged in a cross-bar structure. It provides the flexibility of configuring the chip into different group sizes to accommodate different ATM switch sizes. The ARC chip has been designed and fabricated using 0.8-micron CMOS technology and tested to operate correctly at 240 MHz.

(2) Design of a Generalized Priority Queue Manager for ATM Switches (Chao)

Meeting QoS requirements for various services in ATM networks has been very challenging for network designers. Various control techniques at either the call or cell level have been proposed. We consider cell transmission scheduling and discarding at the output buffers of an ATM switch. We propose a Generalized Priority Queue Manager (GPQM) that uses per virtual connection queueing to support multiple QoS requirements and achieve fairness in both cell transmission and discarding. It achieves the ultimate goal of guaranteeing the QoS requirement for each connection. The GPQM adopts the Earliest-Due-Date (EDD) and Self-Clocked Fair Queueing (SCFQ) schemes for scheduling cell transmission and a new Self-Calibrating Pushout (SCP) scheme for discarding cells. The GPQM's performance in cell loss rate and delay is presented. An implementation architecture for the GPQM is also proposed. It is facilitated by a new VLSI chip called the Priority Content Addressable Memory (PCAM) chip. The chip is being designed using CMOS 0.8-micron technology.

(3) Design and Analysis of a New Cell Discarding Scheme: Self-Calibrating Pushout (Chao)

We proposed a new cell discarding scheme, Self-Calibrating Pushout (SCP), to balance the cell loss rates by measuring, in real time, the numbers of discarded cells from low and high priorities. The primary advantage of the SCP scheme over the other proposed cell discarding schemes is its capability of automatically calibrating in real time each service class's cell loss rate so as to meet each class's QoS requirement, i.e., it is "self-calibrating". We use the fluid flow model to approximately analyze the SCP scheme with two loss priorities. Different cases depending on the buffer contents and counter values are considered. The resulting equations are partial differential equations, which are difficult to solve. Reasonable assumptions are made to convert the partial differential equations into ordinary differential equations, where solutions become feasible. The boundary conditions are also addressed to complete the solutions.

(4) Multiple Delay Bounds in an ATM Switch with Generalized Processor Sharing Discipline (Chao)

We propose a service scheduling scheme that is capable of providing multiple delay bounds for constrained arrivals in the ATM network. Our design goal is to preserve fair resource sharing within a class while supporting different delay bounds for different classes through proper bandwidth allocation. Our scheme is based on the framework of the Generalized Processor Sharing (GPS) discipline, in which minimum throughput is guaranteed for all sessions. Enhanced resource utilization is achieved through systematic analysis of the bandwidth sharing property by using so-called service curves. The analysis enables us to provide a multiple delay bounds guarantee under a simple schedulability test condition, which can be easily adopted in call admission control for a deterministic delay service system.

(5) Call Admission Control with Heterogeneous Traffic (Boorstyn)

We have developed a call admission procedure for heterogeneous collections of homogeneous sources. We consider only independent on-off fluid flow sources with exponential on and off times. Let P(n; S; C) be the probability of exceeding a buffer threshold for n homogeneous sources of type S and capacity C. If n and S are vectors, then we are dealing with a heterogeneous collection of ni homogeneous sources of type Si, i = 1,2,... . Let Pd be the desired overflow probability.

For homogeneous sources, we find n, such that P(n; S; C) = Pd. We define R(C) = C/n as the required capacity per source when the channel is completely filled with homogeneous sources. R(C) is monotonically decreasing and represents multiplexing gain. For heterogeneous sources we admit the combination if

Here Xi is the set of source types that are not pairwise assignable with and are smoother than i. Two source types are pairwise assignable if P(n1,n2; Sl,S2;C) < Pd for all n1 and n2 such that

n1 R1(C) + n2 R2(C) < C. A source type Sl is considered smoother than S2 if

P(l; Sl; R1 (C)) < P(1; S2; R2(C)). For a wide range of parameter values, this test has always been found to be successful and moderately conservative. Heuristic versions of this algorithm can be implemented on-line.

An intriguing observation is that most source types belong to one of two classes. In one class the sources have the characteristics of bursty sources and exhibit significant multiplexing gain. The other class consists of smoother sources that exhibit little multiplexing gain. if all sources were of one class, then the simple rule that sums ni Ri(C) would work. With both classes, simplification of our rule would be to let Xi be the set of smoother source types. Then for call admission purposes we are essentially splitting the channel in two - one part for smoother sources, the other for burstier sources. Call admission is easy since the capacities in each group are additive. The multiplexing gain for bursty sources is preserved. Note that all sources share the entire buffer.

If we can extend this behavior to include delay objectives and multiple priorities, then the sources could be naturally segregated into groups for assigning capacity.

(6) Analysis of Selective Cell Discarding Strategies in ATM Networks (Boorstyn)

A problem in sending TCP traffic over ATM networks is that the loss of even one ATM cell corrupts the whole TCP packet. In order to resolve such a problem, it has been proposed that an ATM switch intelligently discard selective cells when congested. Among these schemes are Early Packet Discard (EPD) and Partial Packet Discard (PPD). We have analyzed a simple EPD and PPD model in a single ATM switch using fluid flow traffic.We obtain the joint distributions of queue length and source state. The exact packet loss probability is obtained in a PPD system. We further obtain upper and lower bounds on packet loss probability in an EPD system and compare the performance of EPD to that of an uncontrolled system. We also demonstrate the optimum value of the EPD threshold for many different system parameters.

We conclude that the most important parameter in determining performance characteristics is intensity, which is defined as the traffic load relative to output link capacity. With most of the parameters we have examined, the packet loss probability that EPD or PPD can achieve is about half of that without control. When intensity is low, the effect of discarding schemes is not prominent because packet loss probabilities of systems with and without control are of the same order of magnitude. When intensity is high, however, discarding schemes significantly help reduce packet loss probability. The optimum threshold value for EPD monotonically decreases as the intensity increases. It is shown that when intensity is high, optimum performance is achieved for a wide range of EPD thresholds.

As a complementary measure of performance, we propose efficiency which is the ratio of successful data transmission rate to output link capacity. It is observed that when the intensity is high EPD and PPD can achieve near perfect efficiency, whereas the uncontrolled system has a very low efficiency. We conclude that EPD is robust, i.e., not sensitive to selection of threshold, and, more importantly, yields high efficiency.

(7) Call Acceptance of Multiple Loss and Delay Priorities (Boorstyn)

In high speed networks, to avoid congestion while maintaining a satisfactory utilization, an effective call admission is necessary. This task becomes more complex when the network needs to support varying QoS requirements. Here we consider sources with different loss and delay requirements entering a system with shared buffer B and capacity C. The input traffic may be may be a mixture of layered video, real-time data and leaky bucket-shaped traffic. Managing call admission for the real system can be difficult because of the complex scheduling and admittance mechanisms that may be implemented. We simplify call admission by creating a virtual system which partitions system resources by priority. Using this system we evaluate performance in the real system. Therefore, if a source is accepted (rejected) in the partitioned system, it is accepted (rejected) in the real system.

In the partitioned system we group sources with the same delay priority in one queue with a dedicated buffer size and capacity value. To maintain the various loss priorities within the queue we use thresholds. We match the input streams of the real system to exponentially distributed on-off sources. In the on state a source can produce several loss priority streams simultaneously. Within a single queue, we use large buffer approximations and replace average cell loss with probability of buffer overflow. We can find a simple relation between the losses, the capacity and the buffer size using asymptotic results. We also approximate the relation between the delay bound and capacity in a queue. Given these relationships we can quickly estimate the acceptance of a new call. When a new source requests admittance into the system, we determine the minimum capacity and queue sizes in the partitioned system. If the system resources are adequate, the new source is accepted. If system buffer is not adequate but there is system capacity to spare, we increase the capacities of the queues in such a way that the largest reduction in total used buffer is accomplished with slightest increase in total capacity.

We have compared our algorithm's results to the exact analysis of the partitioned system and also to a real system which uses shared resources. Initial results indicate the algorithm to be moderately conservative.

(8) Buffer Management Using Loss Bounds (Boorstyn)

We investigate providing QoS loss guarantees for multiple priority traffic in several buffer management schemes such as Strict Priority (SP), Selective Blocking (SB), PushOut (PO), and a more complicated system such as Self Calibrating Pushout (SCP). Consider two priority classes - low and high. The total loss is the weighted average of the individual losses and is invariant for the above work-conserving schemes. Thus, if we can determine the high priority loss, the low priority loss can also be determined. To find loss we use a large buffer approximation which gives moderately conservative results for capacity and buffer size, but is often very inaccurate in predicting loss. We have developed a better approximate method to compute loss efficiently and accurately. It is relatively easy to find high priority loss for SP and SB. We use these losses to find bounds on losses for PO and SCP. The resulting call admission algorithm performs well.

(9) Throughput Analysis of Nonblocking Switches (Lee)

One of the open problems in packet switching is to find closed-form solutions of throughput limits for non-blocking switches. In this research, we derive two simple closed-form formulas to approximate the maximum throughput of a switch module. The first formula is the throughput of a switch with look-ahead contention resolution algorithm. We proved that the asymptotic throughput is an exponential function of the look-ahead window size. For the switch module with output space expansion, we obtained another closed-form formula by diffusion approximation.

(10) Dynamic Throttling and Multiplexing of Bursty Traffic for ATM Virtual Channel (Lee)

One of the major advantages of ATM for information transportation is the efficiency gained by statistical multiplexing of bursty traffic generated from different sources, such as interactive data and compressed video image. Since the dynamic nature of bursty calls, flow and congestion control strategies become extremely critical. The objective of the proposed research was to investigate the role that feedback can play at both call level and cell level control. A common class of regulators are the so called leaky bucket regulators. In its simplest form it comprises a token generator which produces tokens at a fixed rate and which are allowed to collect in a finite buffer. Any incoming traffic cell can enter the network if there is a token available in the token buffer, otherwise it must be queued or dropped depending on the nature of the traffic. This policing is done via an access regulator situated between the end terminal and the network. It has been recognized that a conservative scheme such as the one outlined above can only yield modest utilizations. Therefore, if higher utilizations are desired then one must be willing to take higher risks of packet loss and delay. Keeping this in mind, we study the dynamics of a bursty traffic regulator on state dependent token generation.

(11) Adaptive Traffic Controller for Multicast Packet Switching (Lee)

We investigated several improvements to a nonblocking copy network proposed previously for multicast packet switching. Our improvements provide a complete solution to some system problems inherent in multicasting. The input fairness problem caused by overflow is solved by a cyclic-running adder network (CRAN), which can calculate running sums of copy request starting from any input port. The starting point depends adaptively on the overflow condition of the previous time slot. The CRAN also serves as a multicast traffic controller to regulate the overall copy requests. The throughput of a multicast switch can be improved substantially partial service of copy requests is implemented when overflow occurs. Call splitting can also be implemented by the CRAN in a straightforward manner. Non-uniform distribution of replicated packets at the outputs of a copy network may affect the performance of the following routing network. This output fairness problem is solved by cyclically shifting the copy packets in every time slot. An approximate queueing model is developed to analyze the performance of this improved copy network. It shows that if the loading on each output of the copy network is maintained below 30%, the average packet delay in an input buffer would be less than two time slots.

Part III - Ph.D. Students:

  • Dongmyung Ahn, July 1993
  • Jaewoan Byun, January 1994
  • B-S. Choe, January 1994
  • Jeongkuen Suh, December 1995
  • J. Shik Hong, August 1996
  • Daein Jeong, June 1997
  • Hsiling Cheng, July 1997
  • Jinoo Joung, August 1997
  • Jasmine Chennikara
  • Jason Chou
  • Jongtae Song

Part IV - Publications

"Design and implementation of Abacus switch: A scalable multicast ATM switch"
H. J. Chao, B. S. Choe, J. S. Park, and N. Uzun
IEEE J. Select. Areas Commun., Vol. 15, No. 5, pp. 830-843, June 1997.

"Design of a generalized priority queue manager for ATM switches"
H. J. Chao, H. Cheng, Y. R. Jenq, and D. Jeong
IEEE J. Select. Areas Commun., Vol. 15, No. 5, pp. 867-880, June 1997.

"An ATM routing and concentration chip for a scalable multicast ATM switch"
H. J. Chao and N. Uzun
IEEE J. Solid-State Circuits, Vol. 32, No. 6, pp. 816-828, June 1997.

"Design of an ATM shaping multiplexer with guaranteed output burstiness"
H. J. Chao and J. S. Hong
Intl. Journal of Computer System Science & Engineering, Special issue on ATM Switching, Vol. 12, no. 2, pp. 131-141. March 1997.

"A unified overview of implementing ATM-based enterprise local area network for desktop multimedia computing"
Y. T. Hou, L. Tassiulas, and H. J. Chao
in IEEE Commu. Magazine, vol. 34, no. 4, pp. 70, April 1996.

"On the optically transparent WDM ATM Multicast (3M) switches"
F. S. Choa and H. J. Chao
in Fiber and Integrated Optics, Vol. 15, pp. 109-123, 1996.

"An ATM queue manager with multiple delay and loss priorities"
H. J. Chao and N. Uzun
in IEEE/ACM Trans. on Networking, vol. 3, no. 6, pp. 652-659, Dec. 1995.

"Sizing a packet reassembly buffer at a host computer in an ATM network"
D. E. Smith and H. J. Chao
in IEEE/ACM Trans. on Networking, vol. 3, no. 6, pp. 798-808, Dec. 1995.

"Design and analysis of a large-scale multicast output buffered ATM switch"
H. J. Chao and B. S. Choe
IEEE/ACM Trans. on Networking, vol. 3, no. 2, pp. 112-138, Apr. 1995.

"Architecture design of a generalized priority queue manager for ATM switches"
H. J. Chao and D. Jeong
Intl. Switching Symposium, Germany, Apr. 1995.

"A new QOS-guaranteed cell discarding strategy : Self-Calibrating Pushout"
H. J. Chao and H. Cheng
IEEE GLOBECOM'94, San Francisco, CA, Nov. 1994.

"IP on ATM local area networks"
H. J. Chao, D. Ghosal, D. Saha, and A. K. Tripathi
IEEE Commu. Magazine, Aug. 1994.

"Performance analysis of a large-scale multicast output buffered ATM switch"
B. S. Choe and H. J. Chao
IEEE Proc. INFOCOM'94, Toronto, Canada, June 1994.

"Fault-tolerance of a large-scale multicast output buffered ATM switch"
B. S. Choe and H. J. Chao
IEEE Proc. INFOCOM'94, Toronto, Canada, June 1994.

"Queue management with multiple delay and loss priorities for ATM switches"
H. J. Chao and I. H. Pekcan
IEEE ICC '94, New Orleans, LA, May 1994

"A large-scale multicast output buffered ATM switch"
H. J. Chao and B. S. Choe
IEEE GLOBECOM'93, Houston, TX, Dec.1993.

"Queue management with multiple delay and loss priorities for ATM switches"
H. J. Chao and I. H. Pekcan
(invited) The Annual Allerton Conference on Communication, Control, and Computing, University of Illinois, Oct. 1993

"Characterization of Bursty Sources and Capacity Estimation in High Speed Networks"
R. Boorstyn and A. Kershenbaum
Fifth IEEE Intl. Workshop on Computer-aided Modeling, Analysis and Design of Communication Links and Networks, Princeton, April 1994.

"Call Admission and Source Characterization in High Speed Networks"
J. K. Suh
Ph. D. Dissertation, Polytechnic University, December 1995

"Analysis of Selective Cell Discarding Strategies in ATM Networks"
J. Joong
Ph. D. Dissertation, Polytechnic University, August 1997

"Generalized Recursive Sorting Network"
T. T. Lee
Journal of Parallel and Distributed Computing, Vol.21, No.3, May 1994.

"Broadband Packet Switches based on Dilated Interconnection Networks"
T. T. Lee, S. C. Liew
IEEE Trans. on Communications, Vol.42, No.2, Feb.1994.

"N log N Dual Shuffle-Exchange Network with Error-Correcting Routing"
S. C. Liew, T. T. Lee
IEEE Trans. on Communications, Vol.42, No.3, March 1994.

"Parallel Communications for Network Control and Management"
T. T. Lee, S. C. Liew
IEEE Globecom 93, Houston, Texas, Nov.1993.

"An ATM Multicast Switch with Adaptive Traffic Controller"
J. W. Byun, T. T. Lee
IEEE Trans. on Networking, Vol. 2, No.3, June 1994.