Outlier Discovery Paradigm

Modern applications collect staggering volumes of data, from financial transactions to IoT sensor readings. These data sets contain critical insights, ranging from rare phenomena to anomalies indicative of fraud or failure. To separate the valuable from the counterfeit, analysts need to interactively sift through and explore this data deluge. By detecting anomalies, analysts may prevent fraud or catastrophic sensor failures.

While prior research offers a treasure trove of stand-alone algorithms for detecting particular types of outliers, these algorithms tend to be variations on a theme. There is no end-to-end paradigm that brings this wealth of alternative algorithms to bear in an integrated infrastructure, one that supports anomaly discovery over potentially huge data sets while keeping the human in the loop.

Collaborative Research: ELEMENTS: Tuning-free Anomaly Detection Service

Anomaly detection is critical in many scientific and engineering fields, from identifying signatures of new cyberattacks to detecting seizures in EEG medical time series data. Although previous research offers a plethora of anomaly detection algorithms, effective anomaly detection remains challenging for domain experts because it involves a tedious manual tuning process. Specifically, users must first hand-craft features to prepare the data, then determine which of the many algorithms is best suited to their particular task, and finally set parameters to ensure that the chosen algorithm performs well. This is difficult because domain experts often lack sufficient understanding of specific detection algorithms and of machine learning in general. This project addresses this widespread problem by developing a robust self-tuning anomaly detection cyberinfrastructure called the Self-Tuning ANomaly Detection service (STAND).


Selected Recent Publications

  • Huayi Zhang, Lei Cao, Samuel Madden, and Elke Rundensteiner. LANCET: Labeling Complex Data at Scale. VLDB 2021.
  • Huayi Zhang, Lei Cao, Peter VanNostrand, Samuel Madden, and Elke Rundensteiner. ELITE: Robust Deep Anomaly Detection with Meta Gradient. ACM SIGKDD 2021.
  • Huayi Zhang, Lei Cao, Yizhou Yan, Elke Rundensteiner, and Samuel Madden. Continuously Adaptive Similarity Search. ACM SIGMOD 2020.