Real-time data alerts using AWS Big Data Services: A point of view

Recently I implemented an in-house real-time data alert solution that gives operators real-time insight into the oil fields. I used AWS Big Data Services to develop this solution.

While exploring the solution design and approach, I realized that the solution, with a few tweaks, could well be applied to other domains such as real-time log analysis, stock price movements, anomaly detection, health indicator monitoring and many more.

In this article, I will discuss a broad-based solution architecture for all such real-time data alert use cases.

Solution Architecture: Real-time data alerts

The source systems could comprise IoT sensors, Kinesis Agents running on the applications' EC2 instances, existing Apache Kafka streams and so on.

Data from these sources needs to be published to the Amazon Kinesis Data Streams service. Here, the number of shards and the partitioning of the data need to be thought through. This article explains it well enough. Getting this right is essential for avoiding hot shards and throttling of writes to the stream.
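To make the shard and partitioning question concrete, here is a minimal sketch in Python. It relies on two documented behaviours of Kinesis Data Streams in provisioned mode: each shard accepts roughly 1 MB or 1,000 records per second of writes, and a record is routed to a shard by the MD5 hash of its partition key. Everything else is an assumption made for illustration: the function names, the sample keys, and in particular the idea that the shards split the hash key space evenly (a resharded stream may not).

```python
import hashlib
import math
from collections import Counter

# Documented per-shard write limits (provisioned mode), used as rough planning numbers.
MAX_WRITE_BYTES_PER_SEC = 1_000_000
MAX_WRITE_RECORDS_PER_SEC = 1_000


def estimate_shards(records_per_sec, avg_record_bytes):
    """Rough lower bound on the shard count needed for a given write load."""
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / MAX_WRITE_BYTES_PER_SEC)
    by_count = math.ceil(records_per_sec / MAX_WRITE_RECORDS_PER_SEC)
    return max(1, by_bytes, by_count)


def shard_for_key(partition_key, shard_count):
    """Index of the shard a key would land on.

    Assumption: the shards divide the 128-bit MD5 hash key space evenly.
    """
    hash_value = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    range_size = 2**128 // shard_count
    # Clamp the few hash values that fall past the last even boundary.
    return min(hash_value // range_size, shard_count - 1)


def shard_distribution(partition_keys, shard_count):
    """Count how many records each shard would receive for the given keys."""
    return Counter(shard_for_key(key, shard_count) for key in partition_keys)


if __name__ == "__main__":
    print("shards needed:", estimate_shards(records_per_sec=2500, avg_record_bytes=500))
    # One key for every record: all traffic lands on a single (hot) shard.
    print("single key:", dict(shard_distribution(["well-1"] * 1000, 4)))
    # A high-cardinality key such as a sensor id spreads records across shards.
    print("per-sensor keys:", dict(shard_distribution([f"sensor-{i}" for i in range(1000)], 4)))
```

The point of the comparison is that a low-cardinality partition key concentrates traffic on one shard no matter how many shards are provisioned, which is what leads to throttling.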

Once the data is in the stream, it is time to analyze it. AWS offers a SQL-based interface for this, Amazon Kinesis Data Analytics. I found it extremely easy to use. Based on the time window over which we want to analyze the data, say 10 seconds, we can aggregate the stream using 'group by' and specify conditions in a 'where' clause to detect when a threshold was crossed. It could be an increase in average temperature or average pressure during the time window, a drastic increase in sales during the time window, and so on.
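The windowed aggregation described above can be sketched in plain Python. This is not the SQL that runs inside the service; it is a stand-in that shows the same logic: bucket readings into fixed 10-second windows, group by sensor and window, average, and keep only the groups that cross the threshold. The record layout (timestamp, sensor id, value), the field names and the threshold are all assumptions for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 10  # window length used in the text; adjust as needed


def windowed_alerts(readings, threshold, window_seconds=WINDOW_SECONDS):
    """Return one alert per (sensor, window) whose average value exceeds the threshold.

    readings: iterable of (epoch_seconds, sensor_id, value) tuples (assumed layout).
    """
    groups = defaultdict(list)
    for timestamp, sensor_id, value in readings:
        # Align each reading to the start of its fixed, non-overlapping window.
        window_start = int(timestamp // window_seconds) * window_seconds
        groups[(sensor_id, window_start)].append(value)  # the "group by" step

    alerts = []
    for (sensor_id, window_start), values in sorted(groups.items()):
        average = sum(values) / len(values)
        if average > threshold:  # the threshold condition
            alerts.append({
                "sensor_id": sensor_id,
                "window_start": window_start,
                "avg_value": average,
                "count": len(values),
            })
    return alerts


if __name__ == "__main__":
    sample = [
        (0, "a", 80.0), (5, "a", 100.0),  # window starting at 0 for sensor a
        (12, "a", 70.0),                  # window starting at 10 for sensor a
        (3, "b", 50.0),                   # window starting at 0 for sensor b
    ]
    for alert in windowed_alerts(sample, threshold=85.0):
        print(alert)
```

Only the aggregated rows that pass the threshold check leave this step, which keeps the downstream alert stream small.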

Based on the analysis, the records for each time window in which the threshold was crossed are published to another Kinesis data stream, which we will call Alert Streams. This stream contains only the records that have collectively crossed the threshold.

We will now use an AWS Lambda function to generate the alert notification for end users.

Alert Streams trigger the Lambda function, which builds a customized notification message for the end users. This message (mail, text message and so on) is broadcast using Amazon Simple Notification Service (SNS).
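A minimal sketch of such a Lambda function in Python follows. It relies on the standard shape of a Kinesis-triggered Lambda event, where each record carries its payload base64-encoded under `kinesis.data`, and on the boto3 SNS `publish` call. The alert field names, the message wording, the `ALERT_TOPIC_ARN` environment variable and the optional `publish` parameter (added so the logic can be exercised without AWS) are assumptions, not part of the original solution.

```python
import base64
import json
import os


def decode_records(event):
    """Decode the base64-encoded JSON payloads of a Kinesis-triggered Lambda event."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]).decode("utf-8"))
        for record in event.get("Records", [])
    ]


def format_alert(alert):
    """Build the notification text. The field names are assumed for illustration."""
    return (
        f"Alert: sensor {alert['sensor_id']} averaged {alert['avg_value']:.1f} "
        f"in the window starting at {alert['window_start']}"
    )


def handler(event, context, publish=None):
    """Lambda entry point: publish one notification per alert record."""
    if publish is None:
        # Imported lazily so the helpers above can be run without AWS access.
        import boto3

        sns = boto3.client("sns")
        topic_arn = os.environ["ALERT_TOPIC_ARN"]  # hypothetical variable name

        def publish(subject, message):
            sns.publish(TopicArn=topic_arn, Subject=subject, Message=message)

    alerts = decode_records(event)
    for alert in alerts:
        publish("Threshold crossed", format_alert(alert))
    return {"published": len(alerts)}


if __name__ == "__main__":
    payload = json.dumps({"sensor_id": "a", "window_start": 0, "avg_value": 90.0})
    fake_event = {
        "Records": [{"kinesis": {"data": base64.b64encode(payload.encode("utf-8")).decode("ascii")}}]
    }
    # Print instead of calling SNS when trying the handler locally.
    print(handler(fake_event, None, publish=lambda subject, message: print(subject, "|", message)))
```

Subscribers to the SNS topic (email addresses, phone numbers and so on) then receive the message through whichever protocol they subscribed with.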

The end users or operators who receive the alerts can then act accordingly and take necessary course corrections.

The entire solution is scalable and can be developed entirely within the browser!

Certainly, the devil lies in the details of the implementation. I will try to tame it in my next article on this topic, where I will share the outputs of each step discussed above using a sample order-generating application. I shall also try to come up with an approach for estimating the effort required to implement this solution, along with the team composition and a draft project plan.

Stay tuned.

Data enthusiast with passion for building enterprise data solutions