HIGH-SPEED NETWORK INTRUSION DETECTION AND PREVENTION

Network Intrusion Detection and Prevention Systems (NIDPSs) are vital in the fight against network intrusions. NIDPSs search for certain malicious content in network traffic (i.e., signatures) using a method called Deep Packet Inspection (DPI), where incoming packet payloads are compared against known attack signatures. Processing every single byte in the incoming packet payload has a very stringent time constraint, e.g., 200 ps for a 40-Gbps line. Traditional DPI systems either need a large memory space or use special memory, such as Ternary Content Addressable Memory (TCAM), limiting parallelism or yielding high-cost/power consumption. 

The main goal of our group is to develop low-cost, high-speed, and scalable DPI methods. To achieve this goal, we have developed two novel hardware architectures named TriBiCa (Trie Bitmap Content Analyzer) (Artan and Chao, 2007) and LaFA (Lookahead Finite Automata) (Bando et al., 2009). TriBiCa provides minimal perfect hashing functionality for detecting malicious signatures with extremely small memory requirements. LaFA is a scalable data structure for detecting Regular Expressions (RegExes) signatures, which are the de facto way to represent NIDPS signatures today. In addition to TriBiCa and LaFA, we have also proposed an Aggregated Bloom Filter (Artan et al. 2007) to increase the scalability and throughput of Bloom Filters.  In our earlier work, we worked on multi-packet signature detection, where signatures are fragmented into multiple packets (Artan and Chao, 2005).Lookahead Finite Automata (LaFA)RegExes have been widely used to represent complex string patterns in DPI due to their flexibility and conciseness. However, it has been very challenging to implement a high-speed RegEx Detection system with a large number of RegEx rules.  
Our proposed Lookahead Finite Automata (LaFA) is a scalable RegEx detection system that can achieve very high-speed operation and accommodate a large number of rules. It can  also support flexible updates. 

LaFA uses three observations to achieve great scalability and high-speed operation.

  1. RegExes consist of a variety of different components such as character classes or repetitions. Due to this variety, it is hard to identify a method that is efficient for concurrently detecting all of these different components of a RegEx. However, in most cases today, these heterogeneous components are detected by a state machine that consists of homogeneous states. This leads to inefficiency and, as a result, the scalability of such systems is poor.
  2. The order of components in a RegEx is preserved in the state machine detecting this RegEx. However, it may be beneficial to change the order of the detection of components in a RegEx. For instance, it is easy to detect simple strings as they are expected to appear less frequently, whereas others are harder to detect (for instance, character classes) and may appear more frequently. By reordering the detection of the components, the trade-off between detection complexity and appearance frequency can be exploited for better scalability.
  3. Most RegExes share similar components. In the traditional FA approaches, a small state machine is used to detect a component in a RegEx. This state machine is duplicated since the similar component may appear multiple times in different RegExes. Furthermore, most of the time, these RegExes sharing this component cannot appear at the same time in the input. As a result, the repetition of the same state machine for different RegExes introduces redundancy and limits the scalability of the RegEx detection system.

LaFA addresses these three issues with three novel methods as summarized in the example in Figure 1.

Figure 1 : An example illustrating the transformation from a RegEx set R into the corresponding LaFA: (a) The RegEx set R, (b) the NFA corresponding to R, (c) separation of simple strings, (d) reordering the detection sequence, and (e) the final step, the sharing of complex detection modules, which leads to the LaFA corresponding to R. 

We have prototyped LaFA on a Virtex 4 Field-Programmable Gate Array (FPGA) based on the hardware architecture shown in Figure 2. LaFA requires an order of magnitude less memory compared to today’s state-of-the-art RegEx detection systems. A single commodity FPGA chip can accommodate up to 25,000 RegExes. Based on the throughput of our LaFA prototype, we expect that a 34-Gbps throughput can be achieved by using a state-of-the-art FPGA. 

Figure 2 : LaFA ArchitectureTrie Bitmap Content Analyzer (TriBiCa)TriBiCa uses a trie structure with a hash function performed at each layer. Branching is determined by the hashing results with an objective to evenly partition attack signatures into multiple groups at each layer. During a query, as an input traverses the trie, an address to a table in the memory that stores all attack signatures is formed and is used to access the signature for an exact match. Due to the small space required, multiple copies of TriBiCa can be implemented on a single chip to perform pipelining and parallelism simultaneously, thus achieving high throughput.

We have designed the TriBiCa on a modest FPGA chip, Xilinx Virtex 2, achieving 10-Gbps throughput without using any external memory. A proof-of-concept design is implemented and tested with 1-Gbps packet streams. By using today’s state-of-the-art FPGAs, a throughput of 40 Gbps is believed to be achievable.

Figure 3: TriBiCa data structure for 8 signatures and a sample query on the TriBiCa data structure.

 

Figure 4: TriBiCa test setup.Aggregated Bloom FiltersAs part of the low-cost, high-speed, and scalable DPI goal, our research also involves optimizing current state-of-the-art building blocks for DPI. One such popular building block is the Bloom Filter (BF). To increase throughput by reducing off-chip access, BFs are used as on-chip filters. We show that BFs have shortcomings when implemented on hardware. Hence, we propose Aggregated Bloom Filters (ABFs) to increase the throughput and scalability of hardware BFs. ABF leverages the query mechanism for hardware BFs by removing redundant hash calculations and redundant on-chip memory accesses for higher throughput. ABF also improves scalability by aggregating small distributed BFs to a single BF for better on-chip memory utilization. ABF shows seven-fold improvement in the average query throughput and four times less memory usage compared to previous hardware BFs for NIDPS.Prefix Bloom FiltersPrecision is also crucial for NIDPS, which can easily be evaded by fragmentation of attack packets. The straightforward defragmentation method is not applicable at high-speeds due to high-memory requirement. In our earlier work, this multi-packet signature detection problem is addressed using a defragmentation-free, space-efficient solution. A new data structure, the Prefix Bloom Filter (PBF), is proposed to significantly reduce the storage requirement of the problem.

References

[1] M. Bando, N. S. Artan, and H. J. Chao, “LaFA: Lookahead Finite Automata for Scalable Regular Expression Detection,” in ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS 2009), Princeton, NJ, Oct. 2009.
[2] N. S. Artan, H. Yuan, and H. J. Chao, “A Dynamic Load-Balanced Hashing Scheme for Networking Applications,” in IEEE Global Communications Conference (GLOBECOM 2008), New Orleans, LA, Nov-Dec 2008.
[3] M. Bando, N. S. Artan, and H. J. Chao, “Highly Memory-Efficient LogLog Hash for Deep Packet Inspection,” in IEEE Global Communications Conference (GLOBECOM 2008), New Orleans, LA, Nov-Dec 2008.
[4] N. S. Artan, M. Bando, and H. J. Chao, “Boundary Hash for Memory-Efficient Deep Packet Inspection,” in IEEE International Conference on Communications (ICC 2008), Beijing, China, May 2008.
[5] N. S. Artan and H. J. Chao, “Design and Analysis of a Multi-packet Signature Detection System,” Int. J. Security and Networks, vol. 2, no.1/2, pp. 122–136, Mar. 2007. 
[6] N. S. Artan, R. Ghosh, Y. Guo, and H. J. Chao, “A 10-Gbps High-Speed Single-Chip Network Intrusion Detection and Prevention System,” in 50th Annual IEEE Global Communications Conference (GLOBECOM 2007), Washington, DC, Nov. 2007.
[7] N. S. Artan, K. Sinkar, J. Patel, and H. J. Chao, “Aggregated Bloom Filters For Intrusion Detection And Prevention Hardware,” in 50th Annual IEEE Global Communications Conference (GLOBECOM 2007), Washington, DC, Nov. 2007.
[8] N. S. Artan and H. J. Chao, “TriBiCa: Trie Bitmap Content Analyzer for High-Speed Network Intrusion Detection,” in 26th Annual IEEE Conference on Computer Communications (INFOCOM 2007), 2007, pp.125–133.
[9] N. S. Artan and H. J. Chao, “Multi-packet Signature Detection using Prefix Bloom Filters,” in 48th  Annual IEEE Global Communications Conference (GLOBECOM 2005), St Louis, MO, Nov-Dec 2005.