Ellexus and AWS CloudWatch provide integrated infrastructure and application monitoring for scientific workloads and ML

Analytics are both an advantage and a necessity in the cloud. Cost management starts with right-sizing compute and storage, while the rapid prototyping enabled by hybrid cloud lets you stay ahead of the curve as applications and workloads evolve. Only by choosing the right monitoring solutions can you gain the insight you need to make good decisions in an environment as rich in choices as hybrid cloud.

Ellexus provides hybrid cloud monitoring solutions that understand the complexities of high-performance and high-throughput computing (HPC and HTC). With unrivalled real-time insight into I/O, compute and memory patterns that are system and storage agnostic, Ellexus gives HPC organisations the visibility they need to make hybrid cloud a success.

Amazon CloudWatch is a monitoring and management service that provides data and actionable insights for AWS, hybrid and on-premises applications and infrastructure resources. By combining the infrastructure metrics from CloudWatch with the application monitoring from Ellexus Mistral, customers can unlock the potential of an elastic compute resource.

Putting Mistral metrics into CloudWatch

If you use CloudWatch dashboards, you will want to see Ellexus Mistral data alongside the AWS metrics that are already available. For this, Mistral integrates with the AWS PutMetricData API to inject data directly into CloudWatch as custom metrics. The Mistral data, at up to one-second resolution, is then available to the same CloudWatch functionality as any other metric: dashboards, statistics, graphs and alarms.
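To make the PutMetricData step concrete, here is a minimal Python sketch of how application metrics could be published to CloudWatch at one-second resolution. It is not Mistral's own integration code: the namespace "Example/Mistral", the "JobId" dimension, the metric names and the helper functions are all hypothetical, and the sketch assumes boto3 is installed and AWS credentials are configured before publish() is called.

```python
from datetime import datetime, timezone


def build_metric_data(samples, job_id):
    """Turn (name, value, unit) samples into PutMetricData entries.

    The "JobId" dimension is a hypothetical choice for illustration.
    """
    now = datetime.now(timezone.utc)
    return [
        {
            "MetricName": name,
            "Dimensions": [{"Name": "JobId", "Value": job_id}],
            "Timestamp": now,
            "Value": float(value),
            "Unit": unit,
            # StorageResolution=1 marks a high-resolution (one-second) metric.
            "StorageResolution": 1,
        }
        for name, value, unit in samples
    ]


def publish(samples, job_id, namespace="Example/Mistral"):
    """Send the samples to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_metric_data works without it

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=build_metric_data(samples, job_id),
    )


# Hypothetical samples; the metric names are assumptions, not Mistral output.
example_samples = [
    ("ReadBandwidth", 1200, "Bytes/Second"),
    ("OpenCalls", 37, "Count"),
]
example_entries = build_metric_data(example_samples, "job-42")
```

Once metrics are published this way, they can be graphed or alarmed on like any other CloudWatch metric under the chosen namespace.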

Unifying Mistral data with CloudWatch metrics in Elasticsearch

For customers who are seeking a richer dashboard or data pipeline experience, there are many third-party applications and services that can provide anything from a simple operational dashboard to a full ML-powered data pipeline for business intelligence and forecasting.

Elasticsearch is increasingly becoming the standard for handling time series data, and both Mistral and CloudWatch can stream to the Amazon Elasticsearch Service or similar databases. Mistral comes with a plugin that only needs to be told where the Elasticsearch database is, and CloudWatch has streaming services that can also handle near-real-time feeds.
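As an illustration of what streaming time series records into Elasticsearch involves, the sketch below builds a request body for the Elasticsearch bulk API, which expects newline-delimited JSON with an action line before each document. The record fields and the index name "mistral-example" are hypothetical; the format the Mistral plugin actually emits is not shown here.

```python
import json


def to_bulk_body(records, index="mistral-example"):
    """Build a newline-delimited JSON body for the Elasticsearch _bulk API.

    Each document is preceded by an "index" action line, and every line,
    including the last, ends with a newline.
    """
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(record))
    return "".join(line + "\n" for line in lines)


# Hypothetical records; the field names are assumptions for illustration.
example_records = [
    {"@timestamp": "2020-01-01T00:00:00Z", "host": "node-1", "read_bytes": 4096},
    {"@timestamp": "2020-01-01T00:00:01Z", "host": "node-1", "read_bytes": 8192},
]
example_body = to_bulk_body(example_records)
```

The resulting string would be sent as the body of a POST to the cluster's _bulk endpoint with the Content-Type header set to application/x-ndjson.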

Elasticsearch displaying Mistral data

If Mistral is also deployed to an on-premises compute cluster, the data can still be unified in a hosted database. For scalability, we recommend pushing the data to a dedicated log buffering agent such as Fluent Bit and then using a service such as Kafka to provide a fan-in, fan-out architecture that spans on-premises and cloud. Amazon also has a managed Kafka service, Amazon Managed Streaming for Apache Kafka (Amazon MSK), that makes it easy to stand up a scalable data pipeline while maintaining open source compatibility.
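The fan-in, fan-out pattern can be sketched with a small in-memory stand-in for a single Kafka topic. This is not a Kafka client and leaves out partitions, persistence and networking; the Topic class, the source labels and the consumer group names are all hypothetical. It only shows the shape of the flow: several producers append to one log (fan-in), and each consumer group reads every record independently (fan-out).

```python
from collections import defaultdict


class Topic:
    """In-memory stand-in for one Kafka topic (illustration only)."""

    def __init__(self):
        self._log = []  # append-only record log shared by all producers
        self._offsets = defaultdict(int)  # next unread position per group

    def produce(self, source, record):
        """Fan-in: any producer appends to the same log, tagged by source."""
        self._log.append({"source": source, **record})

    def consume(self, group):
        """Fan-out: each group tracks its own offset and sees every record."""
        start = self._offsets[group]
        self._offsets[group] = len(self._log)
        return self._log[start:]


topic = Topic()
topic.produce("on-premises", {"metric": "read_bytes", "value": 4096})
topic.produce("cloud", {"metric": "read_bytes", "value": 8192})

# Two independent consumers, e.g. a dashboard and a long-term archive.
dashboard_records = topic.consume("dashboard")
archive_records = topic.consume("archive")
```

Because each group keeps its own offset, adding a new consumer such as a machine learning pipeline does not disturb the existing ones.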

Intelligent analytics

Most organisations will uncover a lot of easy wins when migrating to the cloud by analysing the Mistral and CloudWatch metrics and tuning the cloud infrastructure accordingly. For longer-term efficiency, those operating at sufficient scale will want to deploy a machine learning data pipeline that ties high-level business goals for cost and engineering efficiency to automatic tuning of the hybrid cloud environment. This agile way of working and model of proactive rapid deployment is where the cloud wins over heterogeneous on-premises resources.

Amazon SageMaker is a fully managed machine learning service that covers the entire machine learning workflow: label and prepare your data, choose an algorithm, train the model, tune and optimise it for deployment, make predictions and take action. Customers gain complete control over the data, the trained models and the outcomes, future-proofing the infrastructure investment and again exploiting the open source community to provide a customised HPC environment with off-the-shelf components.

For more information about how Ellexus can help you to optimise your hybrid cloud infrastructure, get in touch.