Proton On Storm

According to Gartner [1][2][3][4] and [5], two forms of stream processing software have emerged in the past 15 years. The first were complex event processing (CEP) systems: general-purpose development and runtime tools that provide the building blocks for event-driven applications, so that developers can build custom event-processing applications without re-implementing the core algorithms for handling event streams. Modern commercial CEP platform products also include adapters for integrating with event sources, development and testing tools, dashboard and alerting tools, and administration tools.

More recently, the second form was developed: distributed stream computing platforms (DSCPs) such as Amazon Web Services Kinesis[1] and open source offerings including Apache Samza[2], Spark[3] and Storm[4]. DSCPs are general-purpose platforms without full native CEP analytic functions and associated accessories, and are therefore not considered “real” complex event processing platforms. They are, however, highly scalable and extensible and usually offer an open programming model, so developers can add the logic to address many kinds of stream processing applications, including some CEP solutions. In particular, the Apache open source projects (Storm, Spark, and recently Samza) have gained a fair amount of attention and interest ([5], [6]), and these may well mature into commercial offerings in the future and/or be embedded in existing commercial product sets. DSCPs are designed to cope with Big Data requirements, which makes them an essential component of any organization's infrastructure.

Today, some implementations already combine the pattern recognition capability of CEP systems with the scalability that DSCPs offer, resulting in a holistic architecture. ProtonOnStorm is one example.
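The idea of combining CEP-style pattern detection with DSCP-style scaling can be illustrated with a minimal sketch. It is not ProtonOnStorm's programming model and uses neither the Proton nor the Storm API; the names `Event`, `WindowedCountDetector`, `route` and `run`, the threshold and the window length are all assumptions made for illustration. Events are hash-partitioned by key, so each worker sees every event for its keys, and each worker independently detects a simple pattern: a given number of events for the same key within a time window.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from zlib import crc32


@dataclass(frozen=True)
class Event:
    key: str          # partitioning attribute, e.g. an account id (hypothetical)
    timestamp: float  # seconds
    amount: float


class WindowedCountDetector:
    """Emits a derived event when `threshold` events for one key
    fall within `window` seconds (an illustrative pattern only)."""

    def __init__(self, threshold, window):
        self.threshold = threshold
        self.window = window
        self.recent = defaultdict(deque)  # key -> timestamps still in the window

    def process(self, event):
        times = self.recent[event.key]
        times.append(event.timestamp)
        # Drop timestamps that have slid out of the window.
        while times and event.timestamp - times[0] > self.window:
            times.popleft()
        if len(times) >= self.threshold:
            times.clear()  # reset so the same events do not trigger again
            return {"type": "Alert", "key": event.key, "at": event.timestamp}
        return None


def route(event, n_workers):
    """Stable hash partitioning: equal keys always map to the same worker."""
    return crc32(event.key.encode("utf-8")) % n_workers


def run(events, n_workers=4, threshold=3, window=60.0):
    """Feed events to per-partition detectors and collect derived events."""
    workers = [WindowedCountDetector(threshold, window) for _ in range(n_workers)]
    alerts = []
    for event in events:
        derived = workers[route(event, n_workers)].process(event)
        if derived is not None:
            alerts.append(derived)
    return alerts
```

Because the pattern state is kept per key and routing is deterministic, adding workers spreads the keys across more detectors without changing which alerts are produced. In a real deployment the workers would run on separate nodes rather than in one loop.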

The project partner IBM has implemented its open source complex event processing research tool, IBM PROactive Technology ONline (PROTON), on top of Storm within the FP7 EU FERARI project, thus making it a distributed and scalable CEP engine. ProtonOnStorm has been released as open source under the Apache 2.0 license. The source code, along with manuals, can be accessed at [5].

ProtonOnStorm is currently applied in two other EU projects for different purposes: in SPEEDD (FP7) it is extended to cope with certain aspects of uncertainty, while in Psymbiosys (H2020) it is applied to different manufacturing intelligence scenarios that use IoT devices. Both projects deal with Big Data.

Proton on STORM in topology

The ProtonOnStorm programming model serves as the basis for the optimization plans developed within the project. The events derived by the system, in the form of fraud alerts, feed the project's dashboard.

ProtonOnStorm is the underlying CEP engine in the FERARI demo, which has been accepted for SIGMOD16 and will also be shown as an invited demo at DEBS16.

[1] http://aws.amazon.com/kinesis/

[2] http://samza.incubator.apache.org/

[3] http://spark.apache.org/streaming/

[4] https://storm.apache.org/

[5] https://github.com/ishkin/Proton/