10X Improvement in Dashboard Query Performance By Re-architecting Data Pipelines
Streamlining ETL pipelines to improve query performance and enable better forecasting and predictive maintenance of machines
Business Challenges
The customer is a leading manufacturer of industrial tools, household hardware, and security systems, with thousands of machines installed across plants to automate processes such as welding and bolting. Sensors installed on these machines produced IoT data, such as current drawn and voltage fluctuations, which was collected on a centralized IoT server. The customer monitored this data on a central dashboard built in-house. However, the dashboard was slow and did not scale, which delayed insights and predictions of machine failures.
The customer wanted to enable:
- Central monitoring and diagnosis of machines across manufacturing factories
- Cost-efficient forecasting of the probability of machine downtime and malfunction, which would save millions of dollars by preventing product recalls caused by defective manufacturing
- Near real-time updates of dashboards
Sigmoid Solution
We addressed the customer’s needs by building an ETL data pipeline, a Spark-based data lake, and a high-speed dashboard to enable faster query processing. Leveraging open-source technologies, we integrated and scaled the prediction and forecasting layers into the new solution while automating the deployment.
A centralized IoT MQTT broker collected IoT data, such as telemetry and faults, from sensors across factories. The raw data was enriched, processed, and harmonized before being written to the S3 data lake, which was used for monitoring and analysis. These pipelines employed machine learning scripts to predict and forecast events. We used NitroDb, Sigmoid’s proprietary platform and query accelerator, to create ML models for predictions.
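As a rough illustration of the enrich-and-harmonize step described above, the following Python sketch turns one raw telemetry message into a flat record and derives a partitioned data lake key for it. It is not the actual pipeline code: the topic layout, the payload fields (`ts`, `current_ma`, `voltage`), the `MACHINE_METADATA` lookup, and the key format are all assumptions made for the example, and the MQTT subscription and S3 write are left out.

```python
import json
from datetime import datetime, timezone

# Hypothetical lookup table: machine ID -> plant metadata used for enrichment.
MACHINE_METADATA = {
    "weld-001": {"plant": "plant-a", "process": "welding"},
}


def harmonize(topic: str, payload: bytes) -> dict:
    """Turn one raw telemetry message into a flat, enriched record.

    Assumes a topic layout of factory/<machine_id>/<kind> and a JSON payload
    with an epoch timestamp, current in milliamps, and voltage in volts.
    """
    raw = json.loads(payload)
    parts = topic.split("/")
    machine_id, kind = parts[-2], parts[-1]

    # Enrich: attach plant and process information for the machine.
    meta = MACHINE_METADATA.get(machine_id, {})

    # Harmonize: UTC ISO-8601 timestamps, amps and volts as floats.
    event_time = datetime.fromtimestamp(raw["ts"], tz=timezone.utc)
    return {
        "machine_id": machine_id,
        "kind": kind,
        "plant": meta.get("plant", "unknown"),
        "process": meta.get("process", "unknown"),
        "event_time": event_time.isoformat(),
        "current_a": float(raw["current_ma"]) / 1000.0,
        "voltage_v": float(raw["voltage"]),
    }


def s3_key(record: dict) -> str:
    """Build a hypothetical object key partitioned by plant and date."""
    day = record["event_time"][:10]  # YYYY-MM-DD prefix of the ISO timestamp
    return (
        f"telemetry/plant={record['plant']}/date={day}/"
        f"{record['machine_id']}.jsonl"
    )


if __name__ == "__main__":
    message = b'{"ts": 0, "current_ma": 1500, "voltage": "229.5"}'
    record = harmonize("factory/weld-001/telemetry", message)
    print(s3_key(record), json.dumps(record))
```

Partitioning keys by plant and date is one common way to let a query engine such as Spark skip irrelevant files, which is in the spirit of the faster dashboard queries described here, though the layout actually used is not stated.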
Business Impact
We built an efficient, near real-time ETL data pipeline for the customer that delivered a 10X improvement in query processing time and increased scalability. We enabled centralized monitoring through dashboards and reduced the time for data to appear on them from 3 days to less than an hour. The solution also halved the time to market for new monitoring features while keeping deployment costs minimal.