Projects
We are pleased to announce that we have received funding for the following projects:
Gloria: Graph-based Sharing Optimizer for Event Trend Aggregation, 2022
Large workloads of event trend aggregation queries are widely deployed to derive high-level insights about current event trends in near real time. To speed up their execution, we identify and leverage sharing opportunities from complex patterns with flat Kleene operators or even nested Kleene expressions. We propose Gloria, a graph-based sharing optimizer for event trend aggregation. First, we map the sharing optimization problem to a graph path search problem in the Gloria graph, with execution costs encoded as weights. Second, we shrink the search space by applying cost-driven pruning principles that guarantee the optimality of the reduced Gloria graph in most cases. Lastly, we propose a path search algorithm that identifies the sharing plan with minimal execution costs. Our experimental study on three real-world data sets demonstrates that our Gloria optimizer effectively reduces the search space, leading to a 5-fold speed-up in optimization time. The optimized plan consistently reduces query latency by 68%-93% compared to the plans generated by state-of-the-art approaches.
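To illustrate the general idea of a cost-driven path search over a weighted graph (a minimal sketch, not the actual Gloria algorithm or cost model), the following Python snippet finds a minimum-cost path while pruning paths that cannot beat the cheapest cost already known for a node. The node names, edge weights, and toy graph are hypothetical.

import heapq

def min_cost_plan(graph, source, target):
    """Minimum-cost path search over a weighted sharing graph.

    graph: dict mapping node -> list of (neighbor, execution_cost) pairs.
    Returns (total_cost, path) for the cheapest source -> target path.
    """
    frontier = [(0, source, [source])]
    best = {source: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == target:
            return cost, path
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            # Cost-driven pruning: skip extensions that cannot improve on
            # the cheapest cost already recorded for this node.
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor, path + [neighbor]))
    return float("inf"), []

# Toy example: nodes stand for candidate sharing decisions,
# edge weights for estimated execution costs (illustrative values).
toy_graph = {
    "start": [("share_AB", 3), ("no_share", 5)],
    "share_AB": [("plan_done", 2)],
    "no_share": [("plan_done", 1)],
}
print(min_cost_plan(toy_graph, "start", "plan_done"))

In Gloria itself, the graph nodes and weights are derived from the query workload and its estimated execution costs; the toy graph above only mirrors the structure of the search.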
To Share, or not to Share Online Event Trend Aggregation Over Bursty Event Streams, 2021
Complex event processing (CEP) systems continuously evaluate large workloads of pattern queries under tight time constraints. Event trend aggregation queries with Kleene patterns are commonly used to retrieve summarized insights about recent event trends in event streams. To reduce the processing latency of these aggregation queries, special-purpose optimization techniques, namely online aggregation and common sub-pattern sharing, have been introduced. However, these methods are limited by the overhead of repetitive computations or unnecessary pattern construction. Further, they result in statically selected and hence rigid sharing plans that are often sub-optimal under stream fluctuations. In this work, we thus propose Hamlet, a novel framework that is the first to overcome these limitations. Hamlet introduces two key innovations. First, Hamlet dynamically decides whether or not to share event trend aggregation queries depending on the current stream properties, to harvest the maximum sharing benefit. Second, Hamlet is equipped with a highly efficient shared trend aggregation execution strategy that avoids trend construction. Our experimental study on both real and synthetic data sets demonstrates that Hamlet consistently reduces query latency by up to five orders of magnitude compared to state-of-the-art approaches.
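As a rough illustration of a runtime share-or-not-share decision (a sketch under simplifying assumptions, not Hamlet's actual benefit model), the snippet below compares a hypothetical shared versus non-shared cost estimate for the current burst; the cost formulas and parameters are illustrative assumptions.

def sharing_benefit(num_queries, burst_size, overhead_per_event=1.0):
    """Estimate the benefit of sharing for the current burst.

    Non-shared cost: each query aggregates the burst independently.
    Shared cost: one pass over the burst plus a per-event sharing overhead.
    (Both formulas are illustrative placeholders.)
    """
    non_shared_cost = num_queries * burst_size
    shared_cost = burst_size + overhead_per_event * burst_size
    return non_shared_cost - shared_cost

def decide(num_queries, burst_size):
    # Share only when the estimated benefit is positive for this burst.
    return "share" if sharing_benefit(num_queries, burst_size) > 0 else "do not share"

print(decide(num_queries=5, burst_size=1000))  # share
print(decide(num_queries=1, burst_size=1000))  # do not share

Because the decision is re-evaluated as stream properties change, the sharing plan can adapt to bursts rather than being fixed at query registration time.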
Shared Complex Event Trend Aggregation, 2020
Streaming analytics deploy Kleene pattern queries to detect and aggregate event trends against high-rate data streams. Despite increasing workloads, most state-of-the-art systems process each query independently, thus missing cost-saving sharing opportunities. Sharing complex event trend aggregation poses several technical challenges. First, the execution of nested and diverse Kleene patterns is difficult to share. Second, we must share aggregate computation without the exponential costs of constructing the event trends. Third, not all sharing opportunities are beneficial because sharing aggregation introduces overhead. We propose a novel framework, Muse (Multi-query Shared Event trend aggregation), that shares aggregation queries with Kleene patterns while avoiding expensive trend construction.
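To give a flavor of aggregating trends without constructing them (a minimal sketch under simplifying assumptions, not the Muse algorithm), the following snippet counts all trends matched by a Kleene pattern by propagating per-event counts instead of enumerating the exponentially many trends; the compatibility predicate and toy data are hypothetical.

def count_trends(events, compatible):
    """events: time-ordered list; compatible(prev, cur) -> bool."""
    trends_ending_at = []
    for i, cur in enumerate(events):
        count = 1  # the trend consisting of `cur` alone
        for j in range(i):
            if compatible(events[j], cur):
                # Every trend ending at events[j] can be extended by cur,
                # so its count carries over without materializing the trend.
                count += trends_ending_at[j]
        trends_ending_at.append(count)
    return sum(trends_ending_at)

# Toy example: a Kleene pattern over strictly increasing stock prices.
prices = [10, 12, 11, 13]
print(count_trends(prices, lambda prev, cur: prev < cur))  # 11 matching trends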
Scalable Event Trend Analytics for Data Stream Inquiry, 2018
Data streams have grown at unprecedented scale and velocity in recent years. The real-time discovery of emerging event trends in data streams is essential for time-critical applications, from computing infection spread patterns across major medical facilities to detecting frequent stock trends. Unfortunately, event trend analytics, i.e., the aggregation of complex event trends specified using Kleene-closure-based patterns, is known not only to have prohibitively high computational complexity but also to incur exorbitant memory costs. This project overcomes the shortcomings of state-of-the-art systems by providing, for the first time, practical solutions for this important class of analytics.
Event Stream Analytics, 2017
Advances in hardware, software, and communication networks have enabled applications to generate data at unprecedented volume and velocity. An important type of such data is the event stream generated by financial transactions, health sensors, web logs, social media, mobile devices, and vehicles. The world is thus poised for a sea change in time-critical applications, from financial fraud detection to health care analytics, empowered by inferring insights from event streams in real time. Event processing systems continuously evaluate massive workloads of Kleene queries to detect and aggregate event trends of interest. Examples of such trends include check kites in financial fraud detection, irregular heartbeats in health care analytics, and vehicle trajectories in traffic control. These trends can be of any length. Worse yet, their number may grow exponentially in the number of events. State-of-the-art systems do not offer practical solutions for trend analytics and thus suffer from long delays and high memory costs. In this project, we propose novel event trend detection and aggregation techniques.
Complex Event Analytics, 2010
Recent advances in sensor technologies and the expansion of wired and wireless communication protocols enable us to continuously collect information about the physical world, resulting in a rich set of novel services. The ability to infer relevant patterns from these event streams in real time and at various levels of abstraction, in order to make near-instantaneous decisions, is crucial for a wide range of mission-critical applications, from real-time crisis management to security. This project designs, implements, and evaluates a novel complex event processing methodology, henceforth called Complex Event Analytics (CEA).