A TERABIT IP ROUTER WITH ADVANCED QOS SUPPORT

Part I - Summary of the Project

The Internet is widely considered as the most reachable platform of next-generation information infrastructure. Although IP/DWDM (internet protocol over dense wavelength division multiplexing) technology has been demonstrated to transport 1 terabit/s bandwidth over a single fiber, the capability of routing IP packets at 1 terabit/s still remains to be seen. Furthermore, in order to provide different quality-of-service (QoS) requirements for the Next Generation Internet (NGI), IETF has suggested the Internet Integrated Services (IIS) with resource ReSerVation signaling protocol (RSVP). Although several algorithms have been proposed to guarantee the stringent delay bound for the guarantee service flows, how to implement them at the aggregate rate of terabit/s remains challenging.

This proposal tackles the problem by designing a large-capacity switch fabric for a terabit IP router and tightly controlling packet delay at the input and output ports of the router. The proposed terabit IP switch router consists of multiple routing modules (RMs) that are interconnected by a fast switch fabric. Each RM forwards arriving packets independently, thus allowing the router to scale flexibly up to terabit/s. With the application of the following innovations, supporting the IIS at terabit/s is conceivably achievable.

First, a novel dual round robin (DRR) arbitration scheme is proposed to provide statistically bounded delay at the input ports of a nonblocking input-output buffered crossbar packet switch with a moderate speedup. The DRR is a distributed switch control scheme, in which input selection and output contention resolution are separatedly handled by two independent sets of round-robin arbiters. The switch is used to provide high speed interconnection among RM's, By using bit-slice technique with multiple switch planes, the switch operation speed is reduced and can be implemented with low-cost CMOS technology.

Second, a noveltoken tunneling mechanism is proposed to implement the DRR scheme at terabit/s. It is shown that with state-of-the-art 0.25-micron CMOS technology, the arbitration time can be as small as 10 ns for a 256x256 Tb/s packet switch. This scheme can also be extended to handle multiple-priority traffic. To demonstrate the feasibility of the concept, an application specific integrated circuit (ASIC) that implements the DRR scheme will be prototyped.

Finally, packet fair queuing (PFQ) has been proven capable of providing delay bound guarantee for the IIS. It will be used at the output ports of the router. Arrival packets are usually time stamped based on their committed rates. However, time stamp's computation and sorting always pose a great challenge for high speed networks, especially with hundreds of thousands of flows that need to be supported. Therefore, a novel RAM-based searching engine (RSE) is proposed to implement the PFQ scheduler at 10 Gb/s for each output port at low cost by using off-shelf parts while supporting hundreds of thousands of flows.

Furthermore, due to the heterogeneous networking environment in the Internet, hierarchical packet scheduling is required to accommodate different service classes and multiple Internet Service Providers (ISP's). The shaper-scheduler scheme has been proven capable of providing tight delay bound in the hierarchical packet scheduling environment. Unfortunately, the scheme is very complicated and has not been shown feasible at high speed and at low cost, especially for packet media. Therefore, a general shaper-scheduler is proposed for both ATM and packet networks. The concept of the RSE will be extended to a two-dimensional RSE (2-D RSE) that implements the general shaper-scheduler. To demonstrate the feasibility of the concept, a 2-D RSE will also be prototyped.

All the prototypes and the developed technologies will be made transferable to the industry in order to expedite the deployment of the terabit IP routers in the NGI.

Part II - Technical Information

(1) Pipeline-Based Approach for Maximal-Sized Matching Scheduling in Input-Buffered Switches

We proposes an innovative pipeline-based maximal-sized matching scheduling approach, called PMM, for input-buffered switches. It dramatically relaxes the timing constraint for arbitration with a maximal matching scheme. In the PMM approach, arbitration operates in a pipelined manner. Each subscheduler is allowed to take more than one time slot for its matching. Every time slot, one of them provides the matching result. The subscheduler can adopt a pre-existing efficient round-robin-based maximal matching algorithm. We show that PMM provides 100% throughput under uniform traffic since it preserves a desynchronization effect of the round-robin pointers as in the preexisting algorithm. In addition, PMM maintains fairness for best-effort traffic due to the round-robin-based arbitration.

(2) CIXB-1: Combined Input-One-cell-Crosspoint Buffered Switch

Buffered crossbars have been considered as an alternative for non-buffered crossbars to improve switching throughput. The drawback of a buffered crossbar is the memory amount that is pro-portional to the square of the number of ports (O(N^2)). This is not the main limitation when the buffer size is kept to a minimum size such that implementation is feasible. For a small buffer size, the number of ports of a switch module is not limited by the memory amount but by the pin count. We propose a novel architecture: a Combined Input-One-Cell-Crosspoint Buffer crossbar (CIXB-1) with Virtual Output Queues (VOQs) at the inputs and round-robin arbitration. We show that the proposed architecture can provide 100% throughput under uniform traffic. A CIXB-1 offers several advantages for a feasible implementation such as scalability and timing relaxation. With the currently available memory technology, a one-cell crosspoint buffered switch is feasible for a 32 * 32 fabric module.

(3) CIXOB-k Combined Input-Crosspoint-Output Buffered Packet Switch

We proposes an innovative pipeline-based maximal-sized matching scheduling approach, called PMM, for input-buffered switches. It dramatically relaxes the timing constraint for arbitration with a maximal matching scheme. In the PMM approach, arbitration operates in a pipelined manner. Each subscheduler is allowed to take more than one time slot for its matching. Every time slot, one of them provides the matching result. The subscheduler can adopt a pre-existing efficient round-robin-based maximal matching algorithm. We show that PMM provides 100% throughput under uniform traffic since it preserves a desynchronization effect of the round-robin pointers as in the preexisting algorithm. In addition, PMM maintains fairness for best-effort traffic due to the round-robin-based arbitration.

(4) Fast Fault Detection for a Multiple-plane Packet Switch

In high-speed and high-capacity packet switches, system reli-ability is critical to avoid the loss of a huge amount of information and to avoid re-transmission of traffic. We propose a series of concurrent fault-detection mechanisms for a multiple-plane crossbar-based packet switch. Our switch model, called the m + z model, has m active planes and z spare planes. This switch has distributed arbiters on each plane. The spare planes, used for substitution of faulty active ones, are also used in the fault detection mechanism, thus providing sufficient data redundancy for fault detection and location. Our detection scheme is able to detect a single fault in one time slot without increasing transmission overhead. The proposed schemes can be used for switches with different numbers of active planes and the number of spare planes needed for fault detection is small.

(5) Concurrent Round-Robin-Based Dispatching Schemes for Clos-Network Switches

A Clos-network switch architecture is attractive because of its scalability. Previously proposed implementable dispatching schemes from the first stage to the second stage, such as random dispatching (RD), are not able to achieve high throughput unless the internal bandwidth is expanded.

We present two round-robin-based dispatching schemes to overcome the throughput limitation of the RD scheme. First, we introduce a concurrent round-robin dispatching (CRRD) scheme for the Clos-network switch. The CRRD scheme provides high switch throughput without expanding internal bandwidth. CRRD implementation is very simple because only simple round-robin arbiters are adopted. We show via simulation that CRRD achieves 100% throughput under uniform traffic. When the offered load reaches 1.0, the pointers of round-robin arbiters at the first-stage and second-stage modules are completely desynchronized and contention is avoided. Second, we introduce a concurrent master-slave round-robin dispatching (CMSD) scheme as an improved version of CRRD to make it more scalable. CMSD uses hierarchical round-robin arbitration. We show that CMSD preserves the advantages of CRRD, reduces the scheduling time by 30% or more when arbitration time is significant, and has a dramatically reduced number of crosspoints of the interconnection wires between round-robin arbiters in the dispatching scheduler. This makes CMSD easier to implement than CRRD when the switch size becomes large.

Part III - Personnel Ever Supported:

  • Faculty
    • Professor H. Jonathan Chao - Overall Project
  • Post Doctorate Student(s)
    • Dr. Zhigang Jing - Architecture Design and Performance Study
  • Visiting Scholar(s)
    • Dr. Eiji Oki - Algorithm/Architecture Design
  • Graduate Student(s)
    • Roberto Rojas-Cessa (Ph.D. graduated in April. 2001) - VLSI Design


Part IV - Publications:

"Saturn: A terabit packet switch using dual round-robin"
H. J. Chao
IEEE Communications Magazine, Vol. 8, No. 12, pp. 78-84, Dec. 2000.

"A Pipeline-Based Maximal-Sized Matching Scheme for High-Speed Input-Buffered Switches"
E. Oki, R. Rojas-Cessa, and H. J. Chao
submitted to IEICE Trans. Commun.

"Concurrent Round-Robin-Based Dispatching Schemes for Clos-Network Switches"
E. Oki, Z. Jing, R. Rojas-Cessa, and H. J. Chao
to appear in IEEE/ACM Trans. on Networking.

"A Pipeline-Based Approach for a Maximal-Sized Matching Scheduling in Input-Buffered Switches"
E. Oki, R. Rojas-Cessa, and H. J. Chao
IEEE Communication Letters, vol. 5, no. 6, pp. 263-265, June 2001.

"Fast Ping-Pong Arbitration for Input-Output Queued Packet Switches"
H. J. Chao, C. H. Lam, and X. Guo
International Journal of Communication Systems, vol. 14, pp. 663-678, June 2001.

"CIXOB-k: Combined Input-Crosspoint-Output Buffered Packet Switch"
Rojas-Cessa, E. Oki, and H. J. Chao
in IEEE Globecom Conference, San Antonio, Texas, Nov. 2001.

"Fast Fault Detection for a Multiple-plane Packet Switch"
Rojas-Cessa, E. Oki, and H. J. Chao
in IEEE Globecom Conference, San Antonio, Texas, Nov. 2001.

"CIXB-1: Combined Input-Once-Cell-Crosspoint Buffered Switch"
Rojas-Cessa, E. Oki, Z. Jing, and H. J. Chao
IEEE Workshop on High Performance Switching and Routing, Dallas, TX, July 2001.

"Concurrent Round-Robin Dispatching Scheme in a Clos-Network Switch"
E. Oki, Z. Jing, R. Rojas-Cessa and H. J. Chao
in IEEE ICC, Helsinki, June 2001.

Part V - Significant Educational Impact from the Project

The research results have been included in the PI's book, Quality of Service Control in High-Speed Networks published by John Wiley & Sons, Inc, in Nov. 2001. This book is being used in the Graduate Class EL638 Broadband Networks, in Polytechnic University.