Elliot Linzer, M. Vetterli
Computing
Defining outliers by their distance to neighboring data points has been shown to be an effective non-parametric approach to outlier detection. In recent years, many research efforts have focused on developing fast distance-based outlier detection algorithms. Several existing distance-based outlier detection algorithms report log-linear time performance as a function of the number of data points on many real low-dimensional datasets. However, these algorithms are unable to deliver the same level of performance on high-dimensional datasets, since their scaling behavior is exponential in the number of dimensions. In this paper, we present RBRP, a fast algorithm for mining distance-based outliers, particularly targeted at high-dimensional datasets. RBRP scales log-linearly as a function of the number of data points and linearly as a function of the number of dimensions. Our empirical evaluation demonstrates that RBRP outperforms the state-of-the-art algorithm, often by an order of magnitude. © 2008 Springer Science+Business Media, LLC.
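To make the notion of a distance-based outlier concrete, the following is a minimal brute-force sketch. It is not RBRP and does not reproduce the paper's method. It assumes one common formulation in which each point is scored by the distance to its k-th nearest neighbor and the n highest-scoring points are reported. The function names and the parameters `k` and `n` are illustrative assumptions, and the quadratic cost of this baseline is the kind of cost that fast algorithms aim to avoid.

```python
import math


def kth_neighbor_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor (brute force)."""
    # Compute distances to every other point, then pick the k-th smallest.
    dists = sorted(
        math.dist(points[i], q) for j, q in enumerate(points) if j != i
    )
    return dists[k - 1]


def top_outliers(points, k, n):
    """Indices of the n points with the largest k-th nearest neighbor distance.

    Assumed scoring rule for illustration only; runs in O(N^2 log N) time.
    """
    scores = [(kth_neighbor_distance(points, i, k), i) for i in range(len(points))]
    scores.sort(reverse=True)
    return [i for _, i in scores[:n]]


# Four points clustered near the origin and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_outliers(points, k=2, n=1))  # prints [4]
```

The isolated point at index 4 has a far larger second-nearest-neighbor distance than any point in the cluster, so it is the one reported.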
Thomas R. Puzak, A. Hartstein, et al.
CF 2007
Indranil R. Bardhan, Sugato Bagchi, et al.
JMIS
Minkyong Kim, Zhen Liu, et al.
INFOCOM 2008