In a recent “Internet of Things” project we helped a company transform its business and offer new and improved services. If you are not familiar with the megatrend “Internet of Things”, I recommend this Microsoft Webcast.
The company, Dr. A. Kuntze, is based in Germany and specializes in innovative measurement and control technology for water analysis. Their water sensors and control systems are in use worldwide in a wide range of scenarios, including the food & beverage industry.
Before the project, consuming the sensor data without being on site involved VPN networks. Now the data can be consumed in Real-Time, anytime and anywhere, including on mobile devices. We implemented the classic Big Data architecture called “Lambda Architecture”, which combines a Real-Time load and a Batch load and is the basis for the next step of adding prediction and related services on top. All of this, of course, runs on Azure, the Microsoft cloud.
We implemented the Real-Time stream first, using Azure Event Hub, Azure Stream Analytics and Power BI. See my last blog post for a video showing the components and the moving Real-Time Power BI dashboard.
For the batch load we gathered the JSON files (transmitted in Real-Time) in Azure Blob Storage and transformed them in a nightly batch job with Azure HDInsight, Microsoft's adaptation of the Hortonworks Hadoop distribution, which can be used as a service in Azure.
We provision the HDInsight cluster every night just for the duration of the transformation and drop it right after the sensor data has been transformed successfully, which keeps costs down. Schema-flexible Hadoop is the platform of choice for polystructured input files like ours: different end-customers have different numbers of different sensors, measuring different things with different numbers of data points.
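To make the "polystructured" point more concrete, here is a minimal Python sketch of the kind of normalization such a batch job has to perform. It is not the HDInsight job we ran. The message layout and every field name in it (customer, timestamp, sensors, id, quantity, values) are assumptions made up for illustration.

```python
import json
from typing import Any, Dict, Iterator, List


def flatten_message(raw: str) -> Iterator[Dict[str, Any]]:
    """Yield one long-format row per data point of a single JSON message.

    The message layout is hypothetical: a customer, a timestamp and a list
    of sensors, each carrying its own list of measured values.
    """
    message = json.loads(raw)
    for sensor in message.get("sensors", []):
        for index, value in enumerate(sensor.get("values", [])):
            yield {
                "customer": message.get("customer"),
                "timestamp": message.get("timestamp"),
                "sensor_id": sensor.get("id"),
                "quantity": sensor.get("quantity"),
                "point_index": index,
                "value": value,
            }


def flatten_batch(raw_messages: List[str]) -> List[Dict[str, Any]]:
    """Flatten a whole batch of messages into one uniform list of rows."""
    rows: List[Dict[str, Any]] = []
    for raw in raw_messages:
        rows.extend(flatten_message(raw))
    return rows


# Two invented messages with different sensor counts and data point counts.
sample_messages = [
    json.dumps({
        "customer": "A",
        "timestamp": "2015-06-01T00:00:00Z",
        "sensors": [
            {"id": "s1", "quantity": "pH", "values": [7.1]},
            {"id": "s2", "quantity": "chlorine", "values": [0.3, 0.4, 0.5]},
        ],
    }),
    json.dumps({
        "customer": "B",
        "timestamp": "2015-06-01T00:00:00Z",
        "sensors": [
            {"id": "s9", "quantity": "temperature", "values": [18.2, 18.4]},
        ],
    }),
]

rows = flatten_batch(sample_messages)
```

Each row has the same six columns regardless of how many sensors or data points the original message carried, which is what makes the result easy to load into a reporting tool afterwards.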
The transformed data is then presented in Power BI as well. The next graphic shows a high-level architectural overview of the solution.
Both loads (Real-Time and Batch) are enriched with prediction services based on Azure ML (Machine Learning). In the Real-Time stream this means predictive alarming; in the batch load it means predictive maintenance and system optimization.
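The difference between a plain threshold alarm and a predictive alarm can be sketched in a few lines of Python. This is only an illustration: a simple least-squares trend line stands in for the Azure ML model, and the readings, the limit and the horizon are invented values, not project data.

```python
from typing import Sequence, Tuple


def threshold_alarm(value: float, limit: float) -> bool:
    """Classic alarm: fires only once the current reading reaches the limit."""
    return value >= limit


def linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept over the sample indices 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two readings to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predictive_alarm(values: Sequence[float], limit: float, horizon: int) -> bool:
    """Early alarm: fires if the trend reaches the limit within `horizon` samples."""
    slope, intercept = linear_trend(values)
    forecast = intercept + slope * (len(values) - 1 + horizon)
    return forecast >= limit


# Invented readings that rise steadily but are still below the limit.
readings = [0.50, 0.55, 0.60, 0.65, 0.70]
limit = 0.90

fires_now = threshold_alarm(readings[-1], limit)
fires_early = predictive_alarm(readings, limit, horizon=5)
```

With these readings the threshold alarm stays silent while the predictive alarm already fires, because the fitted trend crosses the limit within the next five samples. A trained model would replace the trend line, but the idea of alarming on a forecast instead of the current value is the same.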
For the orchestration of the batch workflow we started with PowerShell, which is a valid option, but soon switched to Azure Data Factory, mainly because we wanted to connect the on-premises data sources of Dr. A. Kuntze (the reference data, which is basically the registered sensor master data) with the cloud components named above. Furthermore, Azure Data Factory gives us a nice graphical overview of the latest loads and detailed log information. A high-level overview of the implemented Data Factory workflow can be seen here:
I’m planning to write more posts on the Azure components we used and their benefits, but for now I want to close this post with the key benefits our customer Dr. A. Kuntze gains from this architecture:
– End-customers and Data Scientists of Dr. A. Kuntze can consume the sensor data in Real-Time. Many of the end-customers (for example in the food & beverage industry) depend on a strictly regulated water supply with quality-proven levels of different ingredients. Failures in the water supply can lead to a complete breakdown of the facility and can be very costly.
– With Power BI as the presentation layer, there are many ways to consume the data, including mobile delivery, the browser and the native Power BI apps.
– Real-Time is good, Predictive is better. The earlier an alarm about a potentially critical situation is raised, the better. The complex event processing in Azure Stream Analytics makes it possible to run complex calculations on the data stream, which provides Real-Time alarming and allows Azure ML prediction models to be integrated for early detection, even before a threshold alarm fires.
– Data collection and transformation in the cloud, combined with Azure ML, kick-starts new services for the Data Scientists of Dr. A. Kuntze. Before the project, their work involved a lot of travel (to get to the data), difficult data gathering and a lot of paperwork to present the results. Now they have each end-customer's complete data history at hand, can search for patterns and derivatives with Hadoop and Azure ML, and can present their work to end-customers in individually prepared Power BI dashboards that show the results and dependencies interactively.
In short: new and higher-quality services and business opportunities.
In the end, the process we walked through reflects exactly what Microsoft CEO Satya Nadella suggested to companies for the upcoming age of the Internet of Things (IoT):
“Every industry is now a software industry where they are building these systems of intelligence. You all are transforming processes by digitizing things that were not digitized before …
Every one of these businesses now is going to become a software business. You’re going to reason over that data, you’re going to build applications, you’re going to do analytics and predictions. You’re going to provide SaaS services that go along with your products.
That’s going to change the gross margin to become more like software margins. That’s what the evolution to these systems of intelligence is going to bring in.”