Skip to content

Architecture

rmuthupandian edited this page Feb 11, 2015 · 1 revision

Architecture

#Screenshot

Jetstream is a Spring based framework for building distributed stream processing applications. Jetstream applications can be deployed in a cloud environment. Tooling is provided to package Jetstream applications into a docker file for cloud deployment. The framework also provides the tooling for building, managing and monitoring Jetstream applications. The framework has a loosely coupled integration with Esper which provides CEP capabilities. Jetstream provides tooling to hot deploy EPL (SQL like language provided by Esper). Currently several Jetstream applications have been successfully deployed in the eBay cloud. We have also deployed and tested some applications in EC2 environment.

Jetstream building blocks

#Screenshot

Channels and Processors are the building blocks of a Jetstream Application. They form the stages of a pipeline within the container. You build specialized channels and processors by extending the abstraction for channels and processors in the framework. There are several off the shelf channels and processors that are already built and provided by the framework for you to use.

Jetstream Application

#Screenshot

Jetstream application is built by creating a pipeline of interconnected stages forming a directed acyclic graph (DAG) of processing stages within the application. All the stages in an application are built as Spring beans and deployed in the Spring container. The pipeline wiring is specified in Spring XML and the pipeline is stitched at run time and not build time. The Spring XML can be sourced from configuration specifications located in classpath, filepath or network path. The config system in every application looks up these configuration repositories in the following order classpath followed by filepath followed by network path. Both classpath and filepath contain static configuration. The dynamic configuration is stored in network path. Jetstream uses Mongo DB to store network path configuration. Jetstream enables seamless changes to topology of a running application without requiring code roll out. This is driven through a configuration push.

Jetstream applications can be clustered using the clustering technology built in to the framework. Jetstream clusters communicate with each other using a pub/sub interface which enables seamless extension of the pipeline across clusters. This way the DAG extends beyond a single application. The cluster nodes can join and leave at will. They are auto discovered. Jetstream clustering provides self healing capabilities where by a loss of a node causes traffic to automatically rebalance and the aggregates are moved from failed node to a new node.

Data Model

Events are serialized as Pojos over the wire. Jetstream events are Maps carrying key value pairs. All pojo's carried by Jetstream events must implement the java Externalizable interface.

JetstreamEvent event - new JetstreamEvent();
event.put("Age", m_r.nextInt(75));

Typical Jetstream System

#Screenshot

A typical Jetstream system is made up of producer and consumer clusters. Jetstream requires a Zookeeper ensemble for a minimum. Consumers advertise themselves and producers discover consumers by listening to consumer advertisements. The advertisements travel through Zookeeper.

Jetstream application's dynamic configuration is stored in Mongo DB. A configuration manager application provides a Web interface for proivisioning applications. This application sends notification to all applications when a Mongo DB is updated with new configuration or configuration changes. This is only needed if the applications require dynamic configuration.

Clone this wiki locally