Modern applications collect staggering volumes of data, from financial transactions to IoT sensor readings. These data sets contain critical insights, ranging from rare phenomena to anomalies indicative of fraud or failure. To separate the valuable from the spurious, analysts need to interactively sift through and explore this data deluge. By detecting anomalies, analysts may prevent fraud or avert catastrophic sensor failures.
While prior research offers a treasure trove of stand-alone algorithms for detecting particular types of outliers, these algorithms tend to be variations on a theme. No end-to-end paradigm yet brings this wealth of alternative algorithms to bear in an integrated infrastructure that supports anomaly discovery over potentially huge data sets while keeping the human in the loop.
Anomaly detection is critical in many scientific and engineering fields, ranging from identifying signatures of new cyberattacks to detecting seizures in EEG medical time series data sets. However, although previous research offers a plethora of anomaly detection algorithms, effective anomaly detection remains challenging for domain experts because it involves a tedious manual tuning process. Specifically, users must first hand-craft features to prepare the data, then determine which of many algorithms is best suited to their particular task, and finally set parameters to ensure the chosen algorithm performs well. This is difficult because domain experts often lack sufficient understanding of specific detection algorithms and of machine learning in general. This project addresses this widespread problem by developing a robust self-tuning anomaly detection cyber-infrastructure called the Self-Tuning ANomaly Detection service (STAND).
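To make the manual tuning burden concrete, here is a minimal Python sketch. It is not part of STAND, and the detectors, function names, and sensor readings are all hypothetical choices made for illustration. It runs two simple textbook detectors (a z-score test and a median-absolute-deviation test) on the same data. The flagged points change with both the choice of algorithm and the choice of threshold, which is exactly the decision a domain expert would otherwise have to make by hand.

```python
from statistics import mean, median, stdev


def zscore_outliers(values, threshold):
    """Return indices whose distance from the mean exceeds `threshold` standard deviations."""
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def mad_outliers(values, threshold):
    """Return indices whose distance from the median exceeds `threshold` median absolute deviations."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


# Hypothetical sensor readings; indices 6 and 9 look unusual to the eye.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1, 9.7, 13.0]

# Same data, different algorithm and parameter choices, different answers.
for name, detector, threshold in [
    ("z-score", zscore_outliers, 3.0),
    ("z-score", zscore_outliers, 2.0),
    ("MAD", mad_outliers, 3.0),
]:
    print(f"{name:8s} threshold={threshold}: {detector(readings, threshold)}")
```

In this toy case the z-score test with a threshold of 3 flags nothing, because the extreme reading inflates the standard deviation it is measured against. Lowering the threshold or switching to the median-based test changes the result, and neither setting is obviously right without knowledge of the detector's behavior.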
Selected Recent Publications.
We are thankful for the support from NSF for this Outlier Discovery research project. Funding information is also listed on each "Learn More" page.