Modern applications, from financial transaction systems to IoT sensor networks, collect staggering volumes of data that contain critical insights, ranging from rare phenomena to anomalies indicative of fraud or failure. To separate the valuable from the counterfeit, analysts need to interactively sift through and explore this data deluge. By detecting anomalies, analysts may prevent fraud or catastrophic sensor failures.
While prior research offers a treasure trove of stand-alone algorithms for detecting particular types of outliers, these algorithms tend to be variations on a theme. There is no end-to-end paradigm that brings this wealth of alternative algorithms to bear in an integrated infrastructure supporting anomaly discovery over potentially huge data sets while keeping the human in the loop.
Anomaly detection is critical in many scientific and engineering fields, ranging from identifying signatures of new cyberattacks to detecting seizures in EEG medical time series data sets. However, although previous research offers a plethora of anomaly detection algorithms, effective anomaly detection remains challenging for domain experts because it involves a tedious manual tuning process. Specifically, users must first hand-craft features to prepare the data, then determine which among many algorithms may be best suited for their particular task, and finally set parameters to ensure the chosen algorithm performs well. This is challenging because domain experts often lack sufficient understanding of specific detection algorithms and of machine learning in general. This project addresses this widespread problem by developing a robust self-tuning anomaly detection cyber-infrastructure called Self-Tuning ANomaly Detection service (STAND).
Professor of Electrical Eng. and Comp. Science
Massachusetts Institute of Technology
Selected Recent Publications.
We are thankful for the support from NSF for this Outlier Discovery research project. Funding is also listed on each "Learn More" page.
An interactive demo of our Automatic Anomaly Detection system, AutoOD, is available below. AutoOD is a self-tuning anomaly detection system designed to address the challenges of method selection and hyper-parameter tuning while remaining unsupervised. It frees users from the tedious manual tuning process often required for anomaly detection by intelligently identifying high-likelihood inliers and outliers. AutoOD features a responsive visual interface, shown in the screenshots below, that allows for seamless user interaction and gives the user insight into how AutoOD operates.
Input Interface
Results Interface
Input Interface: Users can upload data, provide their own anomaly detection methods, specify the column of labels, and customize the expected percentage range of anomalies in their dataset.
Results Interface: Users can filter the chart based on the metrics provided and interact with points by hovering over them to view summary statistics. Clicking on a point shows that point's anomaly score for each unsupervised detector, along with its attribute values from the input dataset. In addition, by moving the slider through each iteration, the user can watch the reliable object set change, and at any time select a point to view the contribution of each detector to its status.
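To make the idea of per-detector scores and a reliable object set more concrete, here is a minimal sketch in Python. It is not AutoOD's actual algorithm: it assumes two toy detectors (a standardized distance from the mean and a k-nearest-neighbour distance) and a simple agreement rule over an expected anomaly percentage range. All function names and the sample data are hypothetical and chosen only for illustration.

```python
import math

def zscore_scores(points):
    """Toy detector 1 (assumption): distance from the mean, scaled per dimension."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    # Fall back to 1.0 when a dimension has zero spread, to avoid dividing by zero.
    stds = [math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n) or 1.0
            for d in range(dims)]
    return [math.sqrt(sum(((p[d] - means[d]) / stds[d]) ** 2 for d in range(dims)))
            for p in points]

def knn_scores(points, k=2):
    """Toy detector 2 (assumption): distance to the k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def top_fraction(scores, fraction):
    """Indices of the highest-scoring ceil(fraction * n) points (at least one)."""
    count = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:count])

def reliable_sets(points, low, high):
    """Hypothetical agreement rule:
    - reliable outliers: flagged by every detector at the low end of the range;
    - reliable inliers: flagged by no detector at the high end of the range.
    """
    all_scores = [zscore_scores(points), knn_scores(points)]
    outliers = set.intersection(*(top_fraction(s, low) for s in all_scores))
    flagged = set.union(*(top_fraction(s, high) for s in all_scores))
    inliers = set(range(len(points))) - flagged
    return outliers, inliers

# Made-up sample data: a tight cluster plus one point far away (index 8).
data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0),
        (0.8, 0.9), (1.1, 1.1), (0.9, 0.8), (8.0, 8.0), (1.0, 0.9)]
outliers, inliers = reliable_sets(data, low=0.05, high=0.2)
print("reliable outliers:", sorted(outliers))
print("reliable inliers:", sorted(inliers))
```

Points that fall in neither set are the uncertain ones. An iterative system could, in principle, use the two reliable sets as provisional labels and refine them in each round, which is one way to picture the reliable object set changing as the slider moves.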