Survey on Complex Event Processing and Predictive Analytics

  Lajos Jenő Fülöp, Gabriella Tóth, Róbert Rácz, János Pánczél, Tamás Gergely, Árpád Beszédes and Lóránt Farkas
Observing failures and other – desired or undesired – behavior patterns in large scale software systems of specific domains (telecommunication systems, information systems, online web applications, etc.) is difficult. Very often, it is only possible by examining the runtime behavior of these systems through operational logs or traces. However, these systems can generate data in order of gigabytes every day, which makes a challenge to process in the course of predicting upcoming critical problems or identifying relevant behavior patterns. We can say that there is a gap between the amount of information we have and the amount of information we need to make a decision. Low level data has to be processed, correlated and synthesized in order to create high level, decision helping data. The actual value of this high level data lays in its availability at the time of decision making (e.g., do we face a virus attack?). In other words high level data has to be available real-time or near real-time. The research area of event processing deals with processing such data that are viewed as events and with making alerts to the administrators (users) of the systems about relevant behavior patterns based on the rules that are determined in advance. The rules or patterns describe the typical circumstances of the events which have been experienced by the administrators. Normally, these experts improve their observation capabilities over time as they experience more and more critical events and the circumstances preceding them. However, there is a way to aid this manual process by applying the results from a related (and from many aspects, overlapping) research area, predictive analytics, and thus improving the effectiveness of event processing. Predictive analytics deals with the prediction of future events based on previously observed historical data by applying sophisticated methods like machine learning, the historical data is often collected and transformed by using techniques similar to the ones of event processing, e.g., filtering, correlating the data, and so on.

In this paper, we are going to examine both research areas and offer a survey on terminology, research achievements, existing solutions, and open issues. We discuss the applicability of the research areas to the telecommunication domain. We primarily base our survey on articles published in international conferences and journals, but we consider other sources of information as well, like technical reports, tools or web-logs.

Survey, Complex Event Processing, Predictive Analytics.