Kanak Agarwal, Eric Rozner, et al.
SIGCOMM 2014
Dictionary matching is a commonly used operation in Information Extraction (IE) systems. It involves matching a set of strings in a document against a dictionary of pre-defined patterns. In this paper, we describe a high-performance, scalable hardware architecture that enables high-throughput dictionary matching on very large dictionaries for text analytics applications. Our hardware accelerator employs a novel hashing-based approach instead of the commonly used deterministic finite automata (DFA) based algorithms. A limitation of DFA-based approaches is that they typically process one character per cycle, whereas the proposed hash-based scheme can process one string token per cycle, achieving significantly higher throughput than DFA-based implementations. Measurements on a prototype implemented on an Altera Stratix IV FPGA device indicate that our hardware dictionary matching engine can process typical document streams at a rate of ~1.5 GB/s (~12 Gbps) while supporting large dictionaries of up to ~100K patterns, making it very useful for IE workload acceleration. © 2013 IEEE.
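To illustrate the contrast the abstract draws between per-character DFA stepping and per-token hashing, the following is a minimal software sketch only, not the paper's hardware design. It assumes word-level tokenization, case-insensitive matching, and a plain Python set as the hash table; the function names (`tokenize`, `build_dictionary`, `match`) are hypothetical and chosen for illustration.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens with (start, end) character offsets."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def build_dictionary(patterns):
    """Store each pattern as a tuple of tokens in a hash set.

    Also returns the longest pattern length in tokens, which bounds
    how many token windows need to be probed at each position.
    """
    entries = {tuple(tok for tok, _, _ in tokenize(p)) for p in patterns}
    max_len = max((len(e) for e in entries), default=0)
    return entries, max_len


def match(text, entries, max_len):
    """Return (start, end, pattern) for every dictionary hit in text.

    Each lookup hashes a whole window of tokens, so the work per step is
    one hash probe per token window rather than one state transition
    per character (assumption: this mirrors the idea, not the hardware).
    """
    tokens = tokenize(text)
    hits = []
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            if i + n > len(tokens):
                break
            key = tuple(tok for tok, _, _ in tokens[i:i + n])
            if key in entries:
                hits.append((tokens[i][1], tokens[i + n - 1][2], " ".join(key)))
    return hits
```

A call such as `match(doc, *build_dictionary(["stratix iv", "ibm"]))` returns the character spans of each matched pattern. The two throughput figures in the abstract are the same rate in different units: 1.5 GB/s × 8 bits per byte = 12 Gbps.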
Raphael Polig, Kubilay Atasu, et al.
FPL 2014
Raphael Polig, Kubilay Atasu, et al.
IEEE Micro
Kanak Agarwal, Harmander Deogun, et al.
ISQED 2006