Qian Huang, George C. Stockman
ICPR 1994
Spotting anomalies in large multi-dimensional databases is a crucial task with many applications in finance, health care, security, etc. We introduce COMPREX, a new approach for identifying anomalies using pattern-based compression. Informally, our method finds a collection of dictionaries that describe the norm of a database succinctly, and subsequently flags those points dissimilar to the norm (those with high compression cost) as anomalies. Our approach exhibits four key features: 1) it is parameter-free; it builds dictionaries directly from data, and requires no user-specified parameters such as distance functions or density and similarity thresholds, 2) it is general; we show it works for a broad range of complex databases, including graph, image and relational databases that may contain both categorical and numerical features, 3) it is scalable; its running time grows linearly with respect to both database size and number of dimensions, and 4) it is effective; experiments on a broad range of datasets show large improvements in both compression and anomaly-detection precision, outperforming its state-of-the-art competitors. © 2012 ACM.
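The idea of scoring a point by its compression cost can be illustrated with a deliberately simplified sketch. This is not the COMPREX algorithm: it assumes one frequency table per column rather than a learned collection of dictionaries over feature groups, it handles categorical values only, and it uses ideal Shannon code lengths (-log2 of the relative frequency) as the cost. The function name `compression_costs` and the toy data are hypothetical.

```python
import math
from collections import Counter


def compression_costs(rows):
    """Return the ideal code length in bits of each row.

    Assumption: one independent code table per column, with code
    lengths equal to -log2(relative frequency). A simplification for
    illustration, not the dictionary-building procedure of COMPREX.
    """
    n = len(rows)
    n_cols = len(rows[0])
    # One "dictionary" per column: value -> number of occurrences.
    tables = [Counter(row[j] for row in rows) for j in range(n_cols)]

    costs = []
    for row in rows:
        bits = 0.0
        for j, value in enumerate(row):
            p = tables[j][value] / n
            bits += -math.log2(p)  # rare values get long codes
        costs.append(bits)
    return costs


# Toy database: nine identical "normal" rows and one unusual row.
rows = [("a", "x")] * 9 + [("b", "y")]
costs = compression_costs(rows)

# The row that is most expensive to encode is the anomaly candidate.
most_anomalous = costs.index(max(costs))
```

Rows are ranked by cost rather than cut off at a threshold, which keeps the sketch in the spirit of the parameter-free claim above.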
Fanhua Shang, L.C. Jiao, et al.
CIKM 2012
James E. Gentile, Nalini Ratha, et al.
BTAS 2009
Mahesh Viswanathan, Homayoon S.M. Beigi, et al.
ICDAR 1999