

Learning From the Past: Automated Rule Generation for
Complex Event Processing

Alessandro Margara, Dept. of Computer Science, Vrije Universiteit Amsterdam ([email protected])
Gianpaolo Cugola, Dip. di Elettronica, Informazione e Bioingegneria, Politecnico di Milano ([email protected])
Giordano Tamburrelli, Faculty of Informatics, Università della Svizzera Italiana, Lugano ([email protected])

ABSTRACT
Complex Event Processing (CEP) systems aim at processing large flows of events to discover situations of interest. In CEP, the processing takes place according to user-defined rules, which specify the (causal) relations between the observed events and the phenomena to be detected. We claim that the complexity of writing such rules is a limiting factor for the diffusion of CEP. In this paper, we tackle this problem by introducing iCEP, a novel framework that learns, from historical traces, the hidden causality between the received events and the situations to detect, and uses it to automatically generate CEP rules. The paper introduces three main contributions. It provides a precise definition for the problem of automated CEP rule generation. It discusses a general approach to this research challenge that builds on three fundamental pillars: decomposition into sub-problems, modularity of solutions, and ad-hoc learning algorithms. It provides a concrete implementation of this approach, the iCEP framework, and evaluates its precision in a broad range of situations, using both synthetic benchmarks and real traces from a traffic monitoring scenario.

Categories and Subject Descriptors
H.2.4 [Database Management]: Systems—Rule-based databases; I.2.6 [Artificial Intelligence]: Learning

Keywords
Complex Event Processing, Rule Generation, Learning

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
DEBS '14, May 26-29, 2014, Mumbai, India.
Copyright 2014 ACM 978-1-4503-2737-4 ...$10.00.

1. INTRODUCTION
Complex Event Processing (CEP) systems analyze large flows of primitive events received from a monitored environment to timely detect situations of interest, or composite events. In CEP, the processing takes place according to user-defined rules, which specify the relations between the observed events and the phenomena to be detected [19].

CEP can be applied to several domains: sensor networks for environmental monitoring [13]; payment analysis for fraud detection [46]; financial applications for trend discovery [20]; RFID-based inventory management for anomaly detection [51]. More generally, as observed in [33], the information system of every company could and should be organized around an event-based core that acts as a nervous system to guide and control the other sub-systems.

While researchers and practitioners working on CEP have focused mainly on processing efficiency, achieving remarkable results [12, 2, 17, 18], widespread adoption of CEP technologies depends on a correct and precise modeling of the application domain under analysis using CEP rules. Surprisingly, this aspect has received little attention so far, and users are left alone in the hard task of rule definition. To explain the reason for this difficulty, let us refer to the concrete case of vehicular traffic analysis. Rules must describe the causal relations between observations (e.g., the number of cars on a given road, the average speed on a given path) and the situation that must be detected (e.g., traffic congestion). A wide range of factors contribute to defining these causal relations: the events recently received, the information they carry, and their specific order. In many cases, these factors are partially or completely unknown.

To overcome this limitation and wield all the potential hidden in CEP, we need a set of techniques to support users in rule definition. More specifically, we have to identify the conceptual foundations and the key algorithms that allow us to learn, using available historical information, those CEP rules that appropriately express the relevant causal relations of a given domain.

This is exactly the goal of this paper, which contributes to the research on CEP in three ways. First, it provides a precise but general definition of the problem of automated generation of CEP rules, concretely referring to the operators of CEP languages as illustrated in [19]. Second, it discusses a general approach to this research challenge that builds on three fundamental pillars: decomposition into sub-problems, modularity of solutions, and ad-hoc learning algorithms. Finally, it provides a concrete implementation of this approach: the iCEP framework.

iCEP analyzes historical traces and learns from them. It adopts a highly modular design, with different components considering different aspects of the rule, such that users may decide, based on their knowledge of the domain, which modules to adopt and which hints to provide to guide the learning process and increase its precision. We evaluate iCEP in a broad range of situations, using both synthetic benchmarks and real traces from a traffic monitoring scenario.
The rest of the paper is organized as follows. Section 2 introduces the event and rule models we adopt. Section 3 defines the problem of CEP rule learning, decomposes it into subproblems, and proposes a general approach to cope with it. Section 4 provides the details of the design and implementation of the iCEP framework. Section 5 evaluates iCEP using a number of synthetic and real traces. It highlights the accuracy of the framework and discusses potential limitations and improvements. Finally, Section 6 surveys related work and Section 7 provides some conclusive remarks.

2. BACKGROUND
Over the last few years, CEP has received increasing attention as a mainstream approach for real-time monitoring and situation detection. By learning and generating rules for CEP we aim at exploiting the results that research on CEP technologies has produced so far: low-latency processing of information, and scalability in the volume of input events, in the number of event sources, and in the number of rules [17].

Despite the differences among the various CEP engines and rule languages that both academia and industry have proposed so far, it is possible to identify an abstract event model and a relatively small number of abstract operators that cover the functionalities of most of these systems [19]. In the next paragraphs, we introduce this event model and these operators.

2.1 Event Model
We assume that each event notification is characterized by a type and a set of attributes. The event type defines the number, order, names, and types of the attributes that build the event itself. We also assume that events occur instantaneously. Accordingly, each notification includes a timestamp, which represents the time of occurrence of the event it encodes. As an example, the following notification:

Temp@10(room=123, value=24.5)

captures the fact that the air temperature measured inside room 123 at time 10 was 24.5 °C.

2.2 CEP Operators
In most CEP languages, a composite event CE is defined using a pattern of primitive events. When such a pattern is identified, the CEP engine derives that CE has occurred and notifies the interested components. For instance, a composite event Fire could be derived from the presence of smoke and high temperature. When notifications about smoke and high temperature reach the engine, it generates a Fire event. To describe patterns of events, CEP languages rely on some basic building blocks, or operators. In this paper we consider the most relevant ones, as envisaged in [19]:

- Selection filters relevant event notifications according to the values of their attributes.
- Conjunction combines event notifications together.
- Parameterization introduces constraints involving the values carried by different event notifications.
- Sequence introduces ordering relations among events.
- Window defines the maximum timeframe of a pattern.
- Aggregation introduces constraints involving some aggregated value.
- Negation prescribes the absence of certain events.

We are aware that specific CEP languages may implement additional operators or variations of those discussed above. Nevertheless, by focusing on the operators that are available in most CEP languages, our approach provides users with generic, ready-to-use rules. If necessary, these rules can be manually tuned to exploit system-specific features (e.g., customizable selection and consumption policies [19]) or additional, not yet supported operators. The interested reader may refer to Section 6 for a more detailed analysis of CEP languages and systems.

For improved readability, here and in the remainder of the paper we use an ad-hoc, simple and intuitive syntax for event patterns, which supports the seven operators above. The patterns that follow exemplify this syntax by introducing a few possible definitions for a Fire composite event.

Pattern P1
within 5min { Smoke(area=$a) and Temp(area=$a and value>50) }

Pattern P2
within 5min { Smoke() and Temp(value>50) and Wind(speed>20) }
where { Temp->Smoke, Wind->Smoke }

Pattern P3
within 5min { Smoke() and Avg(Temp.value)>50 and not Rain(mm>2) }
where { Temp->Smoke }

Pattern P1 uses the selection operator to accept only Temp notifications whose value exceeds 50; it introduces a window of 5 minutes to find both Smoke and Temp (conjunction); finally, it imposes a common value for the area attribute in Smoke and Temp (using the parameter $a). Pattern P2 shows how the sequence operator can be used to state that both Temp and Wind must precede Smoke. Finally, Pattern P3 shows an aggregate constraint, imposing that the average value of all observed temperatures must be greater than 50; moreover, it introduces a negation, stating that Rain must not be detected within the window of observation.

3. ON AUTOMATED RULE GENERATION
This section defines our problem in detail and discusses the approach we propose for dealing with it.

3.1 Problem Statement
The problem of automated rule generation involves learning the causal relations between primitive and composite events using historical traces. We distinguish between positive traces, in which the composite event occurs, and negative traces, in which the composite event does not occur. More formally, the problem can be stated as follows.

Given a set of event traces Θ, and a composite event CE such that, for each event trace ε ∈ Θ, either CE is the last event in ε (i.e., ε is a positive trace) or it is missing from ε (i.e., ε is a negative trace), automated rule generation aims at deriving the pattern of primitive events whose occurrence leads to CE. As an example, from the three traces below:

A@0, A@5, B@10, C@15, CE@15
A@0, A@7, A@10, C@15
A@0, B@5, B@27, C@30, CE@30

one could infer that CE occurs when:

within 5s { B() and C() }
where { B->C }

This is clearly a fictional example. In practice, capturing the many factors that contribute to the occurrence of a composite event requires a very large number of traces.
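To make the example concrete, the following sketch (hypothetical Python, not part of iCEP) encodes the three traces as lists of (type, timestamp) pairs and checks the inferred pattern against each of them; it fires exactly on the two positive traces.

# Hypothetical sketch: checking the inferred pattern
# "within 5s { B() and C() } where { B->C }" against the example traces.
from typing import List, Tuple

Event = Tuple[str, int]  # (event type, timestamp)

def fires(trace: List[Event], window: int = 5) -> bool:
    """True if some B is followed by a C at most `window` time units later."""
    b_times = [ts for typ, ts in trace if typ == "B"]
    c_times = [ts for typ, ts in trace if typ == "C"]
    return any(0 <= c - b <= window for b in b_times for c in c_times)

traces = [
    [("A", 0), ("A", 5), ("B", 10), ("C", 15)],   # positive trace (CE@15)
    [("A", 0), ("A", 7), ("A", 10), ("C", 15)],   # negative trace
    [("A", 0), ("B", 5), ("B", 27), ("C", 30)],   # positive trace (CE@30)
]
print([fires(t) for t in traces])  # [True, False, True]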
3.2 The Proposed Approach
To tackle the problem of automated rule generation, we propose an approach that builds on three fundamental pillars: decomposition into sub-problems, modularity of the solution, and ad-hoc learning algorithms.

We observe that the problem of automated rule generation can be decomposed into several learning sub-problems, each one considering a different aspect of the rule to discover. In particular, starting from the abstract CEP operators introduced in Section 2, we identified seven sub-problems: (i) determine the relevant timeframe to consider, i.e., the window size; (ii) identify the relevant event types and attributes; determine the (iii) selection and (iv) parameter constraints; (v) discover ordering constraints, i.e., sequences; identify (vi) aggregate and (vii) negation constraints.

We claim that a clear decomposition into sub-problems should not remain at the logical level only, but should drive the design and implementation of a concrete solution, with each sub-problem addressed by a different module. Indeed, a clear separation among these modules (see Section 4.1 and Figure 1) enables them to operate independently from each other, possibly using different strategies and technologies for their implementation. Moreover, it allows for ad-hoc, customized algorithms that address domain-specific requirements. Some modules can even be guided or entirely substituted by human experts. Finally, an approach based on independent modules eases extensibility to other CEP operators that may be introduced in the future.

Starting from this conceptual model, we designed, built, and evaluated different concrete tools for automated rule generation. Our first implementation heavily relied on supervised machine learning [36]. However, these traditional techniques showed some limitations in encoding the operators mentioned above. For example, expressing relations between attributes (i.e., encoding parameter constraints) was impossible without significantly hampering performance. To overcome these problems, we designed a novel algorithm based on a key intuition we describe in the next section.

3.3 Toward a New Learning Strategy for Automated Rule Generation
To solve the automated rule generation problem we can start from a relatively simple consideration: both a CEP rule r and an event trace ε can be associated with a set of constraints: the constraints defined by the rule and those satisfied by the trace, respectively. As an example, consider rule r1, characterized by the pattern:

within 5s { A() and B() }

Intuitively, it defines the set of constraints Sr1:¹

- A: an event of type A must occur;
- B: an event of type B must occur;

Similarly, the event trace ε1: A@0, B@2, C@3 satisfies the set of constraints Sε1:

- A: an event of type A must occur;
- B: an event of type B must occur;
- C: an event of type C must occur;
- A→B: the event of type A must occur before that of type B;
- A→C: the event of type A must occur before that of type C;
- B→C: the event of type B must occur before that of type C.

By looking at rules and traces from this viewpoint, we can assert that for each rule r and event trace ε, r fires if and only if Sr ⊆ Sε.

Given these premises, the problem of learning an unknown rule r boils down to detecting the set of constraints Sr it includes. In this setting, given a positive trace ε, Sε can be considered as an overconstraining approximation of Sr. In the example above, Sε1 includes all the constraints defined by rule r1 (A, B) as well as four additional constraints that are not required to satisfy r1 (C, A→B, A→C, B→C).

To prune these additional constraints, we can look at the set of positive traces collectively. In particular, we can intersect the sets of constraints satisfied by each and every positive trace. For example, let us assume that an additional positive trace ε2: A@0, B@3, D@4 exists, which defines the following set of constraints Sε2:

- A: an event of type A must occur;
- B: an event of type B must occur;
- D: an event of type D must occur;
- A→B: the event of type A must occur before that of type B;
- A→D: the event of type A must occur before that of type D;
- B→D: the event of type B must occur before that of type D.

By intersecting Sε1 with Sε2, we obtain a more precise approximation of Sr1, which only includes the following three constraints: A, B, A→B. Although this approximation still contains an extra constraint (i.e., A→B), it shows the general idea behind our intuition: intersecting the constraints satisfied by positive traces enables us to (at least partially) prune the additional constraints not included in the set of constraints defined by the rule, Sr1 in our example.

Even if this general idea is conceptually simple, its concrete application presents several challenges when it comes to effectively and efficiently supporting the CEP operators described in Section 2. Among these challenges, omitted in the example above, we have to learn the size of the window and consider the presence of negations, aggregates, and parameters. The next section shows how we detailed, tuned, and reified this approach into the iCEP framework.

¹For the sake of readability, this section considers only constraints on the presence and order of primitive events.
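The intersection idea can be captured in a few lines of code. The following sketch (hypothetical Python with a made-up constraints_of helper, not the actual iCEP implementation) derives presence and ordering constraints from each positive trace and intersects them, reproducing the Sε1 ∩ Sε2 example above.

# Illustrative sketch of the intersection idea. Constraints are encoded as
# strings: "A" means "an event of type A must occur", "A->B" means
# "A must occur before B".
from functools import reduce

def constraints_of(trace):
    """Presence and ordering constraints satisfied by one positive trace."""
    events = sorted(trace, key=lambda e: e[1])            # (type, timestamp)
    presence = {typ for typ, _ in events}
    ordering = {f"{a}->{b}"
                for i, (a, ta) in enumerate(events)
                for b, tb in events[i + 1:]
                if a != b and ta < tb}
    return presence | ordering

eps1 = [("A", 0), ("B", 2), ("C", 3)]
eps2 = [("A", 0), ("B", 3), ("D", 4)]
approximation = reduce(set.intersection, map(constraints_of, [eps1, eps2]))
print(approximation)  # {'A', 'B', 'A->B'}, a refined approximation of Sr1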
4. THE iCEP RULE LEARNING SYSTEM
This section presents iCEP. Section 4.1 discusses the high-level architecture of the tool, while Section 4.2 explores each component of this architecture in depth.

4.1 The iCEP Architecture
The problem decomposition described in Section 3.2 directly reflects on the architecture of iCEP, which consists of seven distinct modules, as shown in Figure 1:
[Figure 1: The High-Level Architecture of iCEP. The Window (Win), Events/Attributes (Ev), Constraints (Constr), Aggregates (Aggr), Parameters (Param), Sequences (Seq), and Negations (Neg) Learners process the positive traces; the Neg Learner also receives the negative traces.]

- Events and Attributes (Ev) Learner: finds which event types and attributes are relevant for the rule;
- Window (Win) Learner: finds the minimal time interval that includes all relevant events;
- Constraints (Constr) Learner: finds the constraints that select relevant events based on the values of their attributes;
- Parameters (Param) Learner: finds the parameters that bind the values of attributes in different events;
- Sequences (Seq) Learner: finds the ordering relations that hold among primitive events;
- Aggregates (Agg) Learner: identifies the presence and values of aggregate constraints;
- Negations (Neg) Learner: finds negation constraints.

As shown in Figure 1, each module takes the set of positive traces available and uses them to determine a specific aspect of the rule to learn. In particular, each module is in charge of learning a specific kind of constraint, leveraging the intuition given in Section 3.3. For instance, the Ev Learner looks for the event types and attributes that are relevant for the rule, while the Constr Learner finds the constraints on attribute values. Differently from the other modules, the Neg Learner, being in charge of learning the events that must not occur, also considers negative traces.

Modules operate in cascade as shown by the partially ordered chain depicted in Figure 1. In particular, some modules may proceed in parallel, while others require the results of the previous ones to operate. It is important to notice that the first two modules (the Win Learner and Ev Learner) require each other's output to perform their computation (this mutual dependency is indicated by the two arrows that join them in Figure 1). In the absence of user-provided information that solves this circular dependency, iCEP executes the two modules iteratively to learn the window, the event types, and the relevant attributes for the rule to learn. This iterative process is explained in detail later on in this section.

Organizing modules in cascade provides an additional benefit. Each processing step removes elements from the traces, thus simplifying and speeding up the computation performed in the following steps. As an example, if the Ev Learner determines that only the events of types A or B are relevant for the rule to learn, it can simplify the traces by removing all events with a different type before passing them to the Constr Learner.

The modular architecture of iCEP concretely realizes the design pillars discussed in Section 3, by delegating sub-problems to specific software components. This enables iCEP users to select and adopt only a subset of the modules mentioned above to integrate their (partial) knowledge of the domain. For instance, if the set of relevant events and attributes is known, it can be explicitly provided to the two modules Constr Learner and Win Learner, thus eliminating the need for the Ev Learner.

As a concluding remark, notice that no module (except for the Neg Learner) exploits the information contained in negative traces. All of them implement and refine the conceptual idea of intersecting constraints from all positive traces as described in Section 3.3. While developing iCEP, we tried to improve the performance it provides by also adopting negative traces to discard unrequired constraints, as usually done in traditional machine learning approaches. However, this proved to greatly increase the processing time without yielding significant improvements to the learned rules.
4.2 iCEP Modules Explained
Ev Learner. The role of the Ev Learner is to determine which primitive event types are required for the composite event to occur. It considers the size of the window as an optional input parameter. Let us first consider the simple case in which the window size win is indeed provided, e.g., by domain experts (the more general case in which the size of the window is unknown is covered below). Under this assumption, the Ev Learner can discard events that occurred outside the scope of the window. More precisely, it can cut each positive trace such that it ends with the occurrence Occ of the composite event to detect and starts win time instants before Occ.

For each positive trace, the Ev Learner simply extracts the set of event types it contains. Then, according to the general intuition described in Section 3.3, it computes and outputs the intersection of all these sets.

Finally, the Ev Learner is also responsible for discarding the attributes defined in each event type that are not relevant for the composite event. Indeed, in some application scenarios, it may be possible that event notifications specify a value only for a subset of the attributes defined in their types (leaving the others empty). Detecting relevant attributes is conceptually identical to detecting relevant types: the Ev Learner simply looks at the set of attributes that are defined at least once in every positive trace.
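As a rough illustration of this step (a minimal Python sketch with hypothetical names, not the Java implementation of iCEP), the type intersection can be written as follows; the same shape of code, applied to attribute names, yields the relevant attributes.

# Sketch of the Ev Learner intuition: cut each positive trace to the window
# that precedes the composite event, then intersect the sets of event types.
def relevant_types(positive_traces, ce_times, win):
    """positive_traces: lists of (type, timestamp); ce_times: occurrence time of CE per trace."""
    type_sets = []
    for trace, occ in zip(positive_traces, ce_times):
        in_window = {typ for typ, ts in trace if occ - win <= ts <= occ}
        type_sets.append(in_window)
    return set.intersection(*type_sets) if type_sets else set()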
Win Learner. The Win Learner is responsible for learning the size of the window that includes all primitive events relevant for firing a composite event CE.

Let us first assume for simplicity that the set of relevant primitive event types (and attributes) Sτ is known. In this case, the Win Learner analyzes past occurrences of CE and selects the minimum window win such that, for every occurrence Occ of CE, all elements in Sτ are contained in a timeframe ending with Occ and having size equal to win.

However, as we already observed, it is possible that both the window size and the set Sτ are unknown. This results in the mutual dependency between the Ev Learner and the Win Learner shown in Figure 1. In the absence of external hints from domain experts, we address this dependency by using an iterative approach, which solves the two learning problems at once. In particular, we progressively increase the window size and, for each considered size ws, we execute the Ev Learner assuming such a window ws as the correct one. In doing so, we notice that initially the size of Sτ monotonically increases with ws (as more events enter the window, more event types are detected as relevant), but this behaviour has an end. At a certain point this growth stops and the size of Sτ stabilizes in a plateau. In practice, this happens when ws reaches the value of the window to learn. This behaviour is exemplified in Figure 2, which shows the size of Sτ during a concrete execution of the Ev Learner with a growing window size in a situation where the rule to learn requires three specific event types to appear in a window of size 20. As anticipated, we notice how the size of Sτ shows a plateau when ws reaches a value of 20. With this window size, Sτ contains exactly the three primitive event types required by the rule.

[Figure 2: Executing the Win Learner. Size of Sτ with different window sizes. Positive traces generated from a rule requiring three specific event types to appear in a window of size 20.]

Starting from this consideration, the algorithm implemented in the Win Learner works as follows: it iteratively invokes the Ev Learner with a growing window size and selects the first plateau having a size greater than a value p, which is provided as a parameter; the resulting window size is the starting point of such a plateau (20 in our example).
in Figure 1, the Constr Learner receives in input the posi- over all the events of a certaint type.
tive traces after they have been filtered by the Win Learner The Agg Learner we implemented natively supports the
and Ev Learner. The former resized traces based on the learning of the following aggregation functions: sum, count,
learned window size, while the latter removed irrelevant minimum, maximum, and average. It is possible, for
event types. The goal of the Constr Learner is to learn the example, to learn constraints having the following form:
set of constraints that select the relevant primitive events Avg(Temp.value)>50, which demands the average value
based on the value of their attributes. computed over all the events of type Temp to be greater
Based on our experience, the Constr Learner is a criti- than 50. To increase the customization and the adaptation
cal component of iCEP since it deals with a learning sub- of iCEP to different application domains, we also support
problem characterized by a large solution space. Indeed, user-defined aggregation functions.
depending on the specific application as well as on the sit- After computing the values of aggregates in all positive
uation to detect, the Constr Learner should be capable of traces, the Agg Learner implements the same algorithm
learning different classes of constraints. already described for the Constr Learner. In particular,
The most elementary case is represented by equality con- it first extracts possible aggregate constraints involving an
straints, which impose an equality between the value of an equality relation; if it does not find any of them, it considers
attribute a and a constant c. Detecting equality constraints constraints involving inequality relations.
involves detecting which constant values are associated to an Param Learner. The Param Learner receives in input the
attribute a in all positive traces. This can be easily imple- positive traces after all events that do not match the con-
mented following the core principle described in Section 3.3, tent constraints identified by the Constr Learner have been
i.e., by extracting the set of values for a from all positive removed. Its goal is to extract the parameter constraints
traces and computing their intersection. among the remaining events. Recalling the definition in Sec-
In addition to equality, for numerical attributes iCEP sup- tion 2.2, parameters predicate on the relations between the
ports constraints involving inequality relations (i.e., ≥, ≤, value of attributes belonging to different events. iCEP sup-
6=). As an example, let us consider the following constraint ports equality as well as, in the case of numerical values,
for an event T that predicates on attribute a: a ≥ 0. Intu- inequality relations.
itively, inequality constraints cannot be inferred by simply Following the general principle of intersection among con-
intersecting samples in positive traces. More specifically, straints that underpins iCEP, the Param Learner operates
learning these constraints involves two tasks: finding the in two conceptual steps. (i) First, it considers each positive
specific relation to use (e.g., ≤ or ≥) and extracting the trace ε in isolation and extracts all possible relations among
Agg Learner. As shown in Figure 1, the Agg Learner runs in parallel with the Constr Learner. The two components have many similarities, since they are both responsible for learning constraints on the content of events. However, differently from the Constr Learner, the Agg Learner does not consider the values carried by individual events. Instead, it considers the values computed by aggregation functions over all the events of a certain type.

The Agg Learner we implemented natively supports the learning of the following aggregation functions: sum, count, minimum, maximum, and average. It is possible, for example, to learn constraints having the following form: Avg(Temp.value)>50, which demands the average value computed over all the events of type Temp to be greater than 50. To increase the customization and the adaptation of iCEP to different application domains, we also support user-defined aggregation functions.

After computing the values of the aggregates in all positive traces, the Agg Learner implements the same algorithm already described for the Constr Learner. In particular, it first extracts possible aggregate constraints involving an equality relation; if it does not find any of them, it considers constraints involving inequality relations.

Param Learner. The Param Learner receives in input the positive traces after all events that do not match the content constraints identified by the Constr Learner have been removed. Its goal is to extract the parameter constraints among the remaining events. Recalling the definition in Section 2.2, parameters predicate on the relations between the values of attributes belonging to different events. iCEP supports equality as well as, in the case of numerical values, inequality relations.

Following the general principle of intersection among constraints that underpins iCEP, the Param Learner operates in two conceptual steps. (i) First, it considers each positive trace ε in isolation and extracts all possible relations among the attributes of different events appearing in ε. Consider for example the following trace ε1:

A@10(x=1, y=10), B@12(z=1), C@15(w=1)

The Param Learner extracts the following relations among attributes: A.x = B.z, A.x = C.w, B.z = C.w, A.y > B.z, A.y > C.w. (ii) Second, the Param Learner considers all positive traces together, and computes the intersection of the extracted relations among attributes.
Seq Learner. The Seq Learner works in parallel with the Param Learner and receives the same input from the previous modules. It produces the set of ordering constraints, or sequences, that has to be satisfied to trigger the occurrence of a composite event. Once again, it implements the intersection approach described in Section 3.3. (i) First, it considers each trace in isolation and extracts all the ordering constraints among the events appearing in it. As an example, consider again the trace ε1 illustrated in the previous section. It includes three sequence constraints: A→B, B→C, A→C. (ii) Second, the Seq Learner intersects all the sequence constraints extracted from individual traces, keeping only those that appear in all traces.
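Both learners follow the same extract-then-intersect shape. The sketch below (hypothetical Python, simplified to equality parameters and pairwise orderings) extracts the per-trace candidates that would then be intersected across all positive traces.

# Sketch of the per-trace extraction step used by the Param and Seq Learners.
# An event is modeled as (type, timestamp, {attribute: value}).
def candidate_relations(trace):
    params, sequences = set(), set()
    for i, (t1, ts1, attrs1) in enumerate(trace):
        for t2, ts2, attrs2 in trace[i + 1:]:
            if ts1 < ts2 and t1 != t2:
                sequences.add(f"{t1}->{t2}")          # ordering constraint
            for a1, v1 in attrs1.items():
                for a2, v2 in attrs2.items():
                    if v1 == v2:                      # equality parameter
                        params.add(f"{t1}.{a1} = {t2}.{a2}")
    return params, sequences

eps1 = [("A", 10, {"x": 1, "y": 10}), ("B", 12, {"z": 1}), ("C", 15, {"w": 1})]
print(candidate_relations(eps1))
# parameters: A.x = B.z, A.x = C.w, B.z = C.w; sequences: A->B, A->C, B->C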
Neg Learner. As shown in Figure 1, the Neg Learner, differently from the other modules, also considers negative traces. Indeed, this module is in charge of finding primitive events that must not appear in a trace to trigger the occurrence of a composite event. To do so, the Neg Learner takes as input the set C of all the constraints generated by the other modules in previous steps. Then, it selects the negative traces that satisfy all the constraints in C. In the absence of negations, all the traces satisfying the constraints in C should be positive traces, by definition. The fact that the selected traces are negative implies that they contain some additional primitive event en that prevented the occurrence of the composite event. In other words, en represents the negated event that iCEP needs to learn.

To identify the type and content of en, the Neg Learner applies the same algorithms adopted by the Ev Learner and the Constr Learner to the selected negative traces. In particular, it first looks for common event types appearing in all the selected traces, and then extracts the constraints that predicate on their content. The results produced by this step are appended to the learned rule as negation constraints.
5. EVALUATION
This section evaluates the accuracy of iCEP in a number of different scenarios, based both on synthetic benchmarks and on real datasets. To enable the replicability of all the results discussed in this section, iCEP is publicly available².

²iCEP is written in Java. The source code, the datasets used in our experiments, and the documentation to replicate the experiments are available at http://www.inf.usi.ch/postdoc/margara/software/iCEP.zip

5.1 Synthetic Benchmarks
To thoroughly explore the parameters that can influence the behavior of iCEP, we designed and implemented a framework for synthetic benchmarking, which generates events and rules based on a number of controllable parameters. Figure 3 shows the workflow of our synthetic benchmarking framework. First, it randomly generates a training history of primitive events. Then, it defines an oracle rule R, which we assume to perfectly represent the domain of interest. Next, the benchmarking framework uses R to detect all the composite events in the training history, splitting the training history into a set of positive and negative traces. Finally, it invokes iCEP to analyze the positive and negative traces from the training history and learn a rule R*.

To quantitatively measure the performance of iCEP, the benchmarking framework generates an evaluation history of primitive events and uses both R and R* to detect composite events over it. This allows us to measure the recall of our algorithms, which is the fraction of composite events captured by R that have also been captured by R*; and the precision, which is the fraction of composite events captured by R* that actually occurred, i.e., that were also captured by R. Notice that the benchmarking framework stores both rule R and rule R* for each experiment it executes. This enabled us to study the syntactic differences between the two rules and to isolate the errors introduced by the different learning modules.

[Figure 3: Workflow for Synthetic Benchmarking. Generate rule R and a training history of primitive events; detect all composite events in the training history using R; execute iCEP to learn rule R*; generate an evaluation history; detect all composite events in the evaluation history using both R and R*; compare the results to determine recall and precision.]
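In formulas (standard definitions, restated here for clarity and consistent with the description above; the sets E_R and E_{R*} are our notation for the composite events detected over the evaluation history by R and R*, respectively):

\mathrm{recall} = \frac{|E_{R} \cap E_{R^{*}}|}{|E_{R}|}
\qquad
\mathrm{precision} = \frac{|E_{R} \cap E_{R^{*}}|}{|E_{R^{*}}|}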

Experiment Setup. Several parameters influence the generation of the rule R, the training history, and the evaluation history. For a broad analysis of this space, we defined a default scenario, whose parameters are listed in Table 1, and we investigated the impact of each parameter separately.

Table 1: Parameters in the Default Scenario
- Number of event types: 25
- Distribution of types: Uniform
- Number of attributes per event: 3
- Number of possible values for attributes: 100
- Number of constraints per event: 3
- Distribution of operators for selection constraints: = (20%), ≥ (40%), ≤ (40%)
- Number of positive traces: 1000
- Number of primitive events in R: 3
- Average window size in R: 10 s
- Number of parameters in R: 0
- Number of sequence constraints in R: 0
- Number of aggregates in R: 0
- Number of negations in R: 0
- Distance between primitive events: 1 s

Our default scenario considers 25 types of primitive events T1 . . . T25, each with the same frequency of occurrence. Each primitive event contains three attributes, each one holding an integer value between 0 and 100. Rule R has the following form:

within 10 s { T1(c1 and c2 and c3) and
              T2(c4 and c5 and c6) and
              T3(c7 and c8 and c9) }

where c1, ..., c9 are elementary constraints on attribute values. In our default scenario, rule R does not include parameters, aggregates, sequences, or negations. Both the training and the evaluation histories include one primitive event per second, on average, and fire 1000 composite events (i.e., positive traces). All our experiments have been repeated ten times, using different seeds to generate random values. For each experiment, we plot the average value of recall and precision, and the 95% confidence interval.

Default Scenario. Table 2 shows the results we measured in our default scenario. First of all, the recall is very close to one, meaning that the rules generated by iCEP are capable of capturing almost all the composite events that occur in the evaluation history. Also the precision is above 0.94, meaning that the rule detects about 6% more events than those that actually occurred (false positives). A remarkable result. Moreover, the results are stable over multiple runs: for both recall and precision the 95% confidence interval is lower than 2% of the measured value.

Table 2: Results of the Default Scenario
- Recall (95% confidence interval): 0.9817 (0.0099)
- Precision (95% confidence interval): 0.9419 (0.0168)

By analyzing the differences between the oracle rule R and the learned rule R*, we observed that iCEP always correctly identifies the types and attributes of the primitive events in R, as well as the window size. Moreover, iCEP correctly identifies the absence of parameters, sequences, and negations. The only (limited) differences between R and R* are represented by the constraints on the content of attributes, as learned by the Constr Learner and Agg Learner. In particular, the Constr Learner correctly identifies all equality constraints, while it sometimes over- or under-estimates the value in inequality constraints (i.e., constraints involving the ≤ or ≥ operators). Similarly, the Agg Learner produces aggregate constraints that are not present in rule R, typically introducing a lower (upper) bound for the minimum (maximum) value of an attribute. As explained in Section 4.2, we expect these imprecisions to be mitigated by the presence of hints from domain experts.

Time and Memory Requirements. We consider iCEP as an offline tool, whose results should be used to determine the rules to be subsequently deployed into a CEP engine. However, to provide an idea of the costs of the algorithms we propose, we also report here the time required to analyze the training history and determine rule R*. Using our reference hardware (a Phenom II X6 1055T @ 2.8 GHz with 8 GB of RAM), a complete run of our default scenario (including the time to compute the recall and precision) requires less than 35 s to conclude. Our experiments show that this time increases at most linearly with the number of traces and with the number of events appearing in each trace. Even the most complex tests presented in the remainder of this section required less than ten minutes to run. Finally, we measured a maximum memory consumption of less than 1.5 GB. These numbers allow us to conclude that neither the processing time nor the memory consumption represents a limit for the adoption of iCEP.

Number of Primitive Events. Figure 4 shows how the results produced by iCEP change with the number of primitive events appearing in rule R. First of all, we measured a constant recall, above 0.97, meaning that the learned rule R* correctly identifies almost all the composite events in the evaluation history, independently from the number of primitive events in R. On the other hand, we measure a drop in precision when R includes only one or two primitive events. Indeed, we observed that the Constr Learner usually generates selection constraints that are less selective (i.e., easier to satisfy) than those present in the oracle rule R. This results in producing false positives, i.e., in generating more composite events than those actually occurring in the evaluation history. As Figure 4 shows, the impact of this evaluation error decreases with the number of primitive events. Indeed, selecting a wrong event has fewer chances to trigger a false occurrence when other events are required by the rule.

Size of Window. Figure 5 shows how the size of the window in R impacts the performance of iCEP. Also in this case, the recall remains almost constant. Conversely, the precision slightly decreases with the window's size. This is not a consequence of an error in learning the window size itself. Indeed, the Win Learner always identifies the correct size of the window. On the other hand, a large window has a negative impact on the Ev Learner. This can be explained by noticing that a large window w will include a large number of events, thus increasing the chances of finding events in w that appear in all positive traces and become part of the learned rule R* even if they are not part of R.

Number of Event Types. Figure 6 analyzes the recall and precision of iCEP while changing the overall number of event types appearing in the training and in the evaluation histories. Since types are uniformly distributed, the frequency of occurrence of each type Ti decreases with their number. Thus, as the number of different types decreases, it becomes more and more difficult for iCEP (and in particular for the Ev Learner) to discriminate between relevant and irrelevant types for the rule R. This explains the behavior of Figure 6: with a small number of types (less than five) we measure a recall of 0.71. However, this value rapidly increases, up to 0.98, with seven or more types.

Number of Selection Constraints. This section analyzes how the results of iCEP change when increasing the number of selection constraints for each primitive event appearing in R. As Figure 7 shows, the number of constraints has virtually no impact on the recall. On the other hand, a small number of constraints (less than three) negatively impacts the precision. We already observed a similar behavior in Figure 4: when rule R becomes less selective, the difficulty of the Constr Learner in predicting the correct selection constraints becomes more evident.

Number of Parameters. This section explores the accuracy of the Param Learner in detecting parameter constraints. As shown in Figure 8, introducing parameter constraints has no visible impact on precision (constantly above 0.94) or on recall (constantly above 0.96).

Number of Sequences. This section analyses the impact of sequence constraints on iCEP.
[Figure 4: Number of Primitive Events in R. Recall and precision.]
[Figure 5: Size of Window in R. Recall and precision.]

[Figure 6: Number of Event Types. Recall and precision.]
[Figure 7: Number of Selection Constraints in R. Recall and precision.]

To do so, we repeated the experiment shown in Figure 4, measuring the recall and precision of iCEP while increasing the number of primitive events involved in rule R. However, in this case we also introduced sequence constraints, forcing all the primitive events in R to appear in a precise and predefined order (i.e., in a sequence). Figure 9 shows the results we measured: both precision and recall are almost identical to those measured in Figure 4 (without ordering constraints). iCEP, and in particular the Seq Learner, was always capable of identifying sequence constraints correctly.

Number of Aggregates. This section explores the behavior of iCEP in the presence of aggregates. We considered five different aggregation functions: minimum, maximum, average, count, and sum. Figure 10 shows the results we measured. As in most previous experiments, the recall remains constant and above 0.98. However, we register a drop in precision, which moves from 0.94 (with no aggregate constraints) to 0.78 (with one aggregate constraint). This is due to an inaccurate detection of aggregate constraints. In particular, in the presence of an aggregate constraint involving type Ti, the Ev Learner always recognizes that Ti is required to fire the occurrence of a composite event. In some cases, however, the Agg Learner fails to recognize the presence of an aggregate constraint involving Ti, or, more commonly, generates an under-constraining aggregate. Nevertheless, precision remains constant as the number of aggregates increases.

Presence of Negations. This section investigates the impact of negations on the accuracy of iCEP. As in the case of aggregates, we started from our default scenario and added one or more negation constraints, each of them predicating over a different event type, and including three selection constraints. Figure 11 shows the results we measured. When moving from zero to one negation constraint, we notice that the precision remains almost constant, while the recall decreases from 0.98 to 0.85. We already observed in previous experiments that the selection constraints generated by iCEP are usually less selective than the ones in the oracle rule R. This also occurs when estimating the selection constraints for the negated event: iCEP correctly identifies the type and attributes of the negation, but it generates selection constraints that capture more events than required, thus reducing the recall.

More interesting is the behavior of iCEP in the presence of two negations. In this case, the presence of at least one negated event is sufficient to prevent the firing of a composite event. Because of this, the algorithm implemented in the Neg Learner, which extracts common elements from all negative traces, leads to a lower accuracy. Indeed, in the presence of two negated primitive event types, Tn1 and Tn2, some negative traces may include only Tn1, while others may include only Tn2; thus, the intersection of all negative traces may yield an empty set. This is what happened in our experiment: the Neg Learner fails to correctly identify negation constraints when more than one event is negated. In such cases the precision drops below 0.5. As a positive side effect, the recall returns to 0.98. We further discuss this issue in Section 5.3.

Number of Traces. Finally, this section investigates how the performance of iCEP changes with the number of positive traces available (i.e., composite events). This is a crucial parameter for a learning algorithm. Moreover, in some real situations, composite events can occur quite infrequently (e.g., when monitoring an exceptional system behavior), and collecting a large number of positive traces can be difficult. As expected, by looking at Figure 12, we observe that a very small number of positive traces (below 40) results in poor recall and precision. In this situation, almost all the modules of iCEP provide inaccurate results. However, 40 positive traces are already sufficient to exceed a value of 0.9 in both precision and recall. With 100 positive traces, they both exceed 0.95. After this point, iCEP provides stable results.

5.2 Traffic Monitoring Scenario
The use of synthetic benchmarks enabled us to extensively investigate the accuracy of iCEP. To shed light on the concrete applicability of iCEP, we applied it to a real-world dataset including timestamped information about the position and status of buses in the city of Dublin³. We used the dataset to generate primitive events about the traffic. In particular, we defined a different event type for each bus line (75 in total), where each type defines five attributes including the bus number and the delay w.r.t. the schedule, which are all part of the dataset.

³http://dublinked.com/datastore/datasets/dataset-304.php
[Figure 8: Number of Parameter Constraints in R. Recall and precision.]
[Figure 9: Presence of Sequence Constraints in R. Recall and precision.]

[Figure 10: Number of Aggregate Constraints in R. Recall and precision.]
[Figure 11: Number of Negation Constraints in R. Recall and precision.]

Each bus generates an update approximately every two seconds; thus, every minute contains several events of the same type. This is a very challenging scenario for iCEP, since even windows of limited size are likely to include multiple occurrences of all event types.

To validate iCEP, we defined a rule R similar to the rule illustrated in the default scenario. In particular, the rule fires a composite event Alert whenever at least one bus from n different (and pre-defined) lines has more than one minute of delay. We evaluate the accuracy of iCEP by progressively increasing the number of lines n involved in the rule. We extracted from the dataset information concerning two different days of operation. We use the first day to train iCEP and the second one to evaluate the precision and recall of the learned rule R*. Figure 13 shows the results we measured. We obtained a very high recall and precision, independently from the number of event types (i.e., bus lines) appearing in R. Nevertheless, if we look at the learned rule R*, we observe some differences w.r.t. R, with the former often including additional event types not contained in R. Indeed, the high density of events made it difficult for iCEP to discriminate between relevant and non-relevant event types. On the other hand, this does not impact the ability of the learned rule to detect all and only those composite events that appear during the second day (the evaluation history).

5.3 Discussion
This section presented a detailed evaluation of iCEP. In almost all our tests, we measured a high recall and precision, even when considering relatively large evaluation windows and heterogeneous constraints including parameters, sequences, and aggregates. Although iCEP may receive external hints to improve its accuracy, we never exploited this feature: the learning process was completely automated.

During our tests, we observed the impact of the size and quality of the training dataset. Precision and recall decreased with a (very) limited number of traces (see Figure 12). However, for our default scenario, 100 traces were already enough to obtain a recall and precision above 0.95. Similarly, the accuracy of the results decreased in the presence of a small number of primitive event types (see Figure 6) or with large windows (see Figure 5). Indeed, in both cases, the evaluation window includes multiple events of each type, which may bias the learning process.

From the observations above, we can derive a more general conclusion concerning the characteristics of the input data: since iCEP starts filtering positive traces based on the event types they include, it provides the best accuracy in all the scenarios where each type encodes the occurrence of an exceptional fact. Conversely, as discussed in our traffic monitoring case study, when events in traces encode periodic updates about the state of the environment, we observe some differences between rule R and R*. Nevertheless, also in this case, both precision and recall remain close to one.

Another important aspect that emerged from our analysis is the crucial role of the Constr Learner. On the one hand, almost all modules in iCEP rely on the knowledge of selection constraints. On the other hand, detecting them is very challenging, since it requires generating selection criteria for events based only on the available observed values. As observed in different experiments, this complexity sometimes translates into results that are not fully accurate. We believe that domain-specific techniques and optimizations could contribute to enhancing the accuracy of the Constr Learner. We plan to apply iCEP to specific scenarios (e.g., system security) to validate this hypothesis.

Finally, there are cases in which the general idea of intersecting constraints may generate less accurate results. We experienced this limitation in the case of multiple negations, where the firing of a rule can be prevented by the presence of different negated events. Despite our previous investigations with alternative solutions (e.g., machine learning techniques [36]), this remains an open issue that we plan to investigate further in future work.

6. RELATED WORK
This section revises related work. First, it presents existing CEP systems, with the purpose of showing the applicability of iCEP. Second, it introduces research efforts in the CEP area that are related to rule learning. Finally, it presents existing approaches that deal with the general problem of extracting knowledge from data traces.

Stream and Event Processing Systems. The last few years have seen an increasing interest in technology for stream and event processing, and several systems have been proposed both by academia and by industry [33, 24].
[Figure 12: Number of Composite Events. Recall and precision.]
[Figure 13: Number of Bus Lines. Recall and precision.]

Stream and Event Processing Systems. The last few years have seen an increasing interest in technology for stream and event processing, and several systems have been proposed both by academia and by industry [33, 24]. Although all existing solutions have been designed to accomplish the same goal, i.e., to process large volumes of flowing data on-the-fly, they present different data models and rule definition languages, as well as different processing algorithms and system architectures. For example, the proposed languages range from extensions of regular expressions [32, 12], to logic languages [5, 16, 17, 8], and declarative languages [2, 46]. According to the analysis presented in [19], we can roughly identify two main classes of systems. On the one hand, the database community gave birth to Data Stream Management Systems (DSMSs) [10] to process generic information streams. On the other hand, the community working on event-based systems focused on a specific form of data, event notifications, with a semantics in which the time (of occurrence) plays a central role [37].

DSMSs usually rely on languages derived from SQL, which specify how incoming data have to be transformed, i.e., selected, joined together, and modified, to produce one or more output streams. Processing happens in three steps [6]: first, Stream-to-Relation (or window) operators select a portion of a stream and implicitly create traditional database tables. The actual computation occurs on these tables, using Relation-to-Relation (mostly SQL) operators. Finally, Relation-to-Stream operators generate new streams from the tables obtained after data manipulation. Although several extensions have been proposed [23, 49, 39], they all rely on the general processing schema described above.
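As a purely illustrative sketch (the field names and helper functions below are hypothetical and not taken from any specific DSMS), the three-step schema can be mimicked in a few lines of Python: a time-based window materializes a table from the stream, a relational step filters and aggregates that table, and the result is emitted back as a stream element.

# Sketch of the Stream-to-Relation / Relation-to-Relation / Relation-to-Stream
# schema used by DSMSs (hypothetical field names; tuples carry a "ts" timestamp
# and a numeric "value").

def stream_to_relation(stream, window_size, now):
    # Time-based window: keep the tuples whose timestamp falls in (now - window_size, now].
    return [t for t in stream if now - window_size < t["ts"] <= now]

def relation_to_relation(table, threshold):
    # Relational step: selection and aggregation, as a SQL query would do.
    selected = [t for t in table if t["value"] > threshold]
    avg = sum(t["value"] for t in selected) / len(selected) if selected else None
    return {"count": len(selected), "avg": avg}

def relation_to_stream(result, now):
    # Emit the derived tuple back into an output stream.
    return {"ts": now, **result}

# One evaluation of the pipeline at time 60 over a toy stream.
stream = [{"ts": 10, "value": 3}, {"ts": 45, "value": 8}, {"ts": 58, "value": 12}]
window = stream_to_relation(stream, window_size=30, now=60)
print(relation_to_stream(relation_to_relation(window, threshold=5), now=60))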
At the opposite side of the spectrum, CEP systems are explicitly designed to capture composite events (or situations of interest) from patterns of primitive ones [24]. CEP systems often trade expressiveness for simplicity and performance, providing a reduced set of operators: for example, some languages force sequences to capture only adjacent events [12]; negations are rarely supported [32, 12] or cannot be expressed through timing constraints [3]. iCEP mainly targets this second class of systems, with the aim of learning the causal relations between the presence of primitive events and the occurrence of a composite one. Nevertheless, most of the results we presented can be applied to both classes of systems. Our long-term goal in the development of iCEP is to exploit all the operators offered by the most expressive rule definition languages, thus enabling the derived rules to capture the causal relationships present in the observed environment as precisely as possible.
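To make the flavour of such operators concrete, the following minimal sketch (illustrative only; the event types "A", "B", and "C" are hypothetical and the semantics is deliberately simplified with respect to real CEP languages) fires a composite event when an "A" is followed by a "B" within a given window, with no "C" occurring in between, i.e., a sequence combined with a negation and a timing constraint.

# Simplified sequence-with-negation detection (illustrative only).
# Assumes the input events are ordered by timestamp "ts".

def detect(events, window):
    composites = []
    for i, a in enumerate(events):
        if a["type"] != "A":
            continue
        for b in events[i + 1:]:
            if b["ts"] - a["ts"] > window:
                break                      # timing constraint violated
            if b["type"] == "C":
                break                      # negation: an intervening C discards the match
            if b["type"] == "B":
                composites.append((a, b))  # sequence completed within the window
                break
    return composites

trace = [{"type": "A", "ts": 1}, {"type": "C", "ts": 2},
         {"type": "A", "ts": 5}, {"type": "B", "ts": 7}]
print(detect(trace, window=4))  # only the second A yields a composite event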
In this perspective, it is important to mention that some CEP systems adopt an interval-based, as opposed to point-based, time semantics: time is modeled using two timestamps that indicate the interval in which an event is considered valid [1]. In addition, some CEP systems define iteration operators (Kleene+ [28]) to capture a priori unbounded sequences of events. This is typically exploited to detect trends (e.g., a constantly increasing temperature value). Trend detection, as well as the extension of iCEP to interval-based semantics, is part of our future work.
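The following sketch (illustrative only, and not the exact semantics of any specific engine) shows the kind of computation such an iteration operator enables: it collects a priori unbounded runs of strictly increasing temperature readings and reports each sufficiently long run as a trend.

# Kleene+-style iteration over unbounded runs of readings (illustrative only).
# Assumes a non-empty list of numeric readings ordered by time.

def increasing_runs(readings, min_length=3):
    runs, current = [], [readings[0]]
    for value in readings[1:]:
        if value > current[-1]:
            current.append(value)          # extend the current increasing run
        else:
            if len(current) >= min_length:
                runs.append(current)       # report the completed run as a trend
            current = [value]
    if len(current) >= min_length:
        runs.append(current)
    return runs

print(increasing_runs([19.5, 20.1, 20.8, 21.4, 20.0, 20.2, 20.9]))
# -> [[19.5, 20.1, 20.8, 21.4], [20.0, 20.2, 20.9]]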
Rule Learning and Related Efforts in CEP. Concerning the specific problem of learning CEP rules, we can mention some relevant existing research efforts. For example, [38] describes a rule learning approach based on an extension of Hidden Markov models [42], called Noise Hidden Markov models, that can be trained with existing, low-level event data. This work addresses precisely the same research goal as iCEP; however, it differs in several fundamental aspects. For example, differently from iCEP, it only processes and learns elementary events (i.e., attributes are not supported) and does not provide the same level of expressiveness as iCEP. Indeed, it does not support some operators commonly offered in CEP languages (e.g., aggregates and parameters). Also worth mentioning is [50], which discusses an iterative approach for automating both the initial definition of rules and their update over time, combining partial information provided by the domain experts with machine learning techniques. This work differs from iCEP in two distinct aspects: it explicitly requires human intervention in its tuning phase and it does not provide the same level of expressiveness as iCEP in terms of learned operators (e.g., negations and aggregates). Similarly, [47] discusses an iterative approach to support the domain expert in designing new rules and identifying missing ones.

Proactive event processing [22] is an additional research goal strictly related to the idea of automated rule generation. It aims at predicting the occurrence of future events, thus enabling actions that can mitigate or eliminate undesired situations. Connected to proactive processing is the idea of computing the degree of uncertainty (often modeled as a probability) associated to composite events. The first model proposed for dealing with uncertainty in CEP is described in [52], and extended in [53, 54], where the authors introduce a general framework for CEP in the presence of uncertainty. Similar approaches have been studied in [44, 21, 43]. Worth mentioning is [9], where the authors tackle the issue of uncertainty in transportation systems and explore a logic-based event reasoning tool to identify regions of uncertainty within a stream. Finally, a tutorial on event processing under uncertainty has been presented at the DEBS (Distributed Event Based Systems) 2012 conference [7]. We plan to explore these models to understand a potential integration with iCEP, to offer some indication about the confidence one can put on the learned rules and their derivations.

Related Learning Approaches. The general problem of knowledge learning and mining from (time-annotated) data traces has been extensively studied in the past (e.g., [27]). The interested reader can refer to [45, 31] for an extensive description and classification of the solutions proposed. The techniques adopted in iCEP are close to the algorithms described in [35, 40]. Worth mentioning among them are the approaches that adopt multi-dimensional variables to support events with multiple attributes. Such techniques are considered one of the most important future trends in data mining [30], and several algorithms have already been proposed [26]. In contrast to the approaches mentioned above, iCEP adopts a totally different viewpoint, explicitly conceived for CEP. Indeed, it solves the learning problem by modelling rules and traces as sets of constraints and systematically computing their intersections. Moreover, existing solutions lack the expressiveness of iCEP, which is designed to automatically discover multiple rule features.
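To give a flavour of this intersection-based viewpoint, the following deliberately simplified sketch (illustrative only: the event types are hypothetical, and only the presence of event types is learned, whereas the actual iCEP modules apply the same principle to windows, attribute values, sequences, and aggregates) intersects the constraints exhibited by all positive traces.

# Minimal sketch of learning by intersection of constraints (illustrative only).
# Each positive trace (the window preceding a composite event occurrence) is
# abstracted into the set of primitive event types it contains; the learned
# constraint is the intersection of these sets, i.e., the types that every
# positive trace exhibits.

def learn_required_types(positive_traces):
    constraint_sets = [{e["type"] for e in trace} for trace in positive_traces]
    return set.intersection(*constraint_sets)

positives = [
    [{"type": "Smoke"}, {"type": "HighTemp"}, {"type": "Wind"}],
    [{"type": "HighTemp"}, {"type": "Smoke"}],
]
print(learn_required_types(positives))  # -> {'Smoke', 'HighTemp'} (order may vary)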
As mentioned in Section 5, the algorithms currently implemented in iCEP are not well suited to application fields in which primitive events are produced by periodic readings of a certain value, e.g., the price of a stock or the temperature read by a sensor. Indeed, these settings often require mechanisms to learn temporal trends. We plan to integrate iCEP with trend detection mechanisms, such as those described in [11, 55].

Several approaches have been proposed to construct temporal rules in noisy environments. For example, [29] adopts Markov logic networks to construct knowledge bases, while [4] proposes a data mining algorithm for inducing temporal constraint networks. Interestingly, [56] addresses the problem of unusual event detection and discusses an approach based on semi-supervised Hidden Markov Models combined with Bayesian theory. Finally, the problem of learning rules from data traces has also been addressed in various domains, such as medical applications [41, 14], detection of malicious intrusions in software systems [48, 34], and mining of frequent patterns in alarm logs [25].
7. CONCLUSIONS
In this paper, we addressed the problem of automated rule generation for CEP systems. We precisely defined the problem and proposed a general approach that builds on three fundamental pillars: decomposition into sub-problems, modularity of the solution, and ad-hoc learning algorithms. Moreover, we presented iCEP, a concrete implementation of our approach, which effectively trades off performance and expressiveness. On the one hand, iCEP supports all the operators commonly offered by state-of-the-art CEP systems. On the other hand, it can analyze thousands of traces in a few minutes. In addition, the highly modular architecture of iCEP enables extensibility and customization. An extensive evaluation of iCEP, based on synthetic as well as real data, demonstrated its benefits in terms of recall and precision. As future work, we plan to deploy iCEP in a real-world application to further challenge its concrete applicability and flexibility.

To conclude, while CEP systems have proved their effectiveness and efficiency in a wide range of scenarios, the complexity of rule definition still represents a challenge that has received little attention. We hope that, by building on top of our contribution, and by exploiting and refining our framework, future research could simplify the use of CEP systems and promote their adoption.

Acknowledgment
This research has been funded by the Dutch national program COMMIT and by the European Commission, Programme IDEAS-ERC, Project 227977-SMScom, and Programme FP7-PEOPLE-2011-IEF, Project 302648-RunMore.

8. REFERENCES
[1] R. Adaikkalavan and S. Chakravarthy. SnoopIB: Interval-based event specification and detection for active databases. Data & Knowledge Engineering, 59(1):139–165, 2006.
[2] A. Adi and O. Etzion. Amit - the situation manager. The VLDB Journal, 13(2):177–203, 2004.
[3] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman. Efficient pattern matching over event streams. In SIGMOD, pages 147–160. ACM, 2008.
[4] M. R. Álvarez, P. Félix, P. Cariñena, and A. Otero. A data mining algorithm for inducing temporal constraint networks. In Computational Intelligence for Knowledge-Based Systems Design, pages 300–309. Springer, 2010.
[5] D. Anicic, P. Fodor, S. Rudolph, R. Stühmer, N. Stojanovic, and R. Studer. ETALIS: Rule-based reasoning in event processing. In Reasoning in Event-Based Distributed Systems, pages 99–124, 2011.
[6] A. Arasu, S. Babu, and J. Widom. The CQL continuous query language: semantic foundations and query execution. The VLDB Journal, 15(2):121–142, 2006.
[7] A. Artikis, O. Etzion, Z. Feldman, and F. Fournier. Event processing under uncertainty. In DEBS, 2012.
[8] A. Artikis, G. Paliouras, F. Portet, and A. Skarlatidis. Logic-based representation, reasoning and machine learning for event recognition. In DEBS, pages 282–293. ACM, 2010.
[9] A. Artikis, M. Weidlich, A. Gal, V. Kalogeraki, and D. Gunopulos. Self-adaptive event recognition for intelligent transport management. 2013.
[10] B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In PODS, pages 1–16. ACM, 2002.
[11] D. J. Berndt and J. Clifford. Finding patterns in time series: a dynamic programming approach. In Advances in Knowledge Discovery and Data Mining, pages 229–248. American Association for Artificial Intelligence, 1996.
[12] L. Brenna, A. Demers, J. Gehrke, M. Hong, J. Ossher, B. Panda, M. Riedewald, M. Thatte, and W. White. Cayuga: a high-performance event processing engine. In SIGMOD, pages 1100–1102. ACM, 2007.
[13] K. Broda, K. Clark, R. Miller, and A. Russo. Sage: A logical agent-based environment monitoring and control system. In AmI, pages 112–117, 2009.
[14] G. Carrault, M.-O. Cordier, R. Quiniou, and F. Wang. Temporal abstraction and inductive logic programming for arrhythmia recognition from electrocardiograms. AI in Medicine, 28(3):231–263, 2003.
[15] C. Cortes and V. Vapnik. Support-vector networks. Machine Learning, 20(3):273–297, 1995.
[16] G. Cugola and A. Margara. TESLA: a formally defined event specification language. In DEBS, pages 50–61. ACM, 2010.
[17] G. Cugola and A. Margara. Complex event processing with T-Rex. Journal of Systems and Software, 85(8):1709–1728, 2012.
[18] G. Cugola and A. Margara. Low latency complex event processing on parallel hardware. Journal of Parallel and Distributed Computing, 72(2):205–218, 2012.
[19] G. Cugola and A. Margara. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys, 44(3):15:1–15:62, 2012.
[20] A. J. Demers, J. Gehrke, M. Hong, M. Riedewald, and W. M. White. Towards expressive publish/subscribe systems. In EDBT, pages 627–644, 2006.
[21] Y. Diao, B. Li, A. Liu, L. Peng, C. Sutton, T. T. L. Tran, and M. Zink. Capturing data uncertainty in high-volume stream processing. In CIDR, 2009.
[22] Y. Engel and O. Etzion. Towards proactive event-driven computing. In DEBS, pages 125–136. ACM, 2011.
[23] Esper. http://esper.codehaus.org/, 2013.
[24] O. Etzion and P. Niblett. Event Processing in Action. Manning Publications Co., 2010.
[25] F. Fessant, F. Clérot, and C. Dousson. Mining of an alarm log to improve the discovery of frequent patterns. In Advances in Data Mining, pages 144–152. Springer, 2005.
[26] T. Gärtner, P. A. Flach, A. Kowalczyk, and A. J. Smola. Multi-instance kernels. In ICML, pages 179–186, 2002.
[27] V. Guralnik and J. Srivastava. Event detection from time series data. In SIGKDD, pages 33–42. ACM, 1999.
[28] D. Gyllstrom, J. Agrawal, Y. Diao, and N. Immerman. On supporting Kleene closure over event streams. In ICDE, pages 1391–1393, 2008.
[29] S. Kok and P. Domingos. Learning Markov logic networks using structural motifs. In ICML, pages 551–558, 2010.
[30] H.-P. Kriegel, K. M. Borgwardt, P. Kröger, A. Pryakhin, M. Schubert, and A. Zimek. Future trends in data mining. Data Mining and Knowledge Discovery, 15(1):87–97, 2007.
[31] S. Laxman and P. Sastry. A survey of temporal data mining. Sadhana, 31:173–198, 2006.
[32] G. Li and H.-A. Jacobsen. Composite subscriptions in content-based publish/subscribe systems. In Middleware, pages 249–269. Springer-Verlag New York, Inc., 2005.
[33] D. C. Luckham. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems. Addison-Wesley Longman Publishing Co., Inc., 2001.
[34] M. V. Mahoney and P. K. Chan. Learning rules for anomaly detection of hostile network traffic. In ICDM, pages 601–604. IEEE, 2003.
[35] H. Mannila, H. Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1:259–289, 1997.
[36] A. Margara, G. Cugola, and G. Tamburrelli. Towards automated rule learning for complex event processing, 2013.
[37] G. Mühl, L. Fiege, and P. Pietzuch. Distributed Event-Based Systems. Springer-Verlag, 2006.
[38] C. Mutschler and M. Philippsen. Learning event detection rules with noise hidden Markov models. In NASA/ESA Conf. on AHS, pages 159–166. IEEE, 2012.
[39] Oracle CEP. http://www.oracle.com/technologies/soa/complex-event-processing.html, 2013.
[40] B. Padmanabhan and A. Tuzhilin. Pattern discovery in temporal databases: A temporal logic approach. In KDD, pages 351–354, 1996.
[41] R. Quiniou, L. Callens, G. Carrault, M.-O. Cordier, E. Fromont, P. Mabo, and F. Portet. Intelligent adaptive monitoring for cardiac surveillance. In Computational Intelligence in Healthcare 4, pages 329–346. Springer, 2010.
[42] L. Rabiner and B. Juang. An introduction to hidden Markov models. IEEE ASSP Magazine, 3(1):4–16, 1986.
[43] M. Rajesh Khanna and M. Dhivya. A generic framework for deriving and processing uncertain events in rule-based systems. In ICICES, pages 398–403. IEEE, 2013.
[44] C. Ré, J. Letchner, M. Balazinska, and D. Suciu. Event queries on correlated probabilistic streams. In SIGMOD, pages 715–728. ACM, 2008.
[45] J. Roddick and M. Spiliopoulou. A survey of temporal knowledge discovery paradigms and methods. IEEE TKDE, 14(4):750–767, 2002.
[46] N. P. Schultz-Møller, M. Migliavacca, and P. Pietzuch. Distributed complex event processing with query rewriting. In DEBS, pages 4:1–4:12. ACM, 2009.
[47] S. Sen, N. Stojanovic, and L. Stojanovic. An approach for iterative event pattern recommendation. In DEBS, pages 196–205. ACM, 2010.
[48] C. Sinclair, L. Pierce, and S. Matzner. An application of machine learning to network intrusion detection. In ACSAC, pages 371–377. IEEE, 1999.
[49] StreamBase. http://www.streambase.com/, 2013.
[50] Y. Turchin, A. Gal, and S. Wasserkrug. Tuning complex event processing rules using the prediction-correction paradigm. In DEBS, page 10. ACM, 2009.
[51] F. Wang and P. Liu. Temporal management of RFID data. In VLDB, pages 1128–1139, 2005.
[52] S. Wasserkrug, A. Gal, and O. Etzion. A model for reasoning with uncertain rules in event composition systems. In UAI, pages 599–606, 2005.
[53] S. Wasserkrug, A. Gal, O. Etzion, and Y. Turchin. Complex event processing over uncertain data. In DEBS, pages 253–264. ACM, 2008.
[54] S. Wasserkrug, A. Gal, O. Etzion, and Y. Turchin. Efficient processing of uncertain events in rule-based systems. IEEE Transactions on Knowledge and Data Engineering, 24(1):45–58, 2012.
[55] A. Weigend, F. Chen, S. Figlewski, and S. Waterhouse. Discovering technical traders in the T-bond futures market. In KDD, pages 354–358, 1998.
[56] D. Zhang, D. Gatica-Perez, S. Bengio, and I. McCowan. Semi-supervised adapted HMMs for unusual event detection. In CVPR, volume 1, pages 611–618. IEEE, 2005.
