Award-winning product: Mistral
Ellexus has launched its groundbreaking new tool for load balancing shared storage. Developed in collaboration with the IT department at ARM, Mistral is able to:
• Monitor application I/O and I/O performance across the cluster in order to identify rogue jobs and hotspots
• Load balance shared storage by automatically throttling I/O in problem jobs and applications
Mistral has already won the highly prestigious Product of the Year Award at the Cambridge Lab Ring Hall of Fame Awards 2016.
To find out more about Mistral and to request an evaluation, get in touch.
Solving the noisy neighbour problem
In a compute cluster with shared storage, a small number of jobs can overload the network or file system. This can degrade the performance of every job on the cluster and even bring the cluster down completely. This is called the noisy neighbour problem.
Sometimes this problem is caused by rogue jobs that have been submitted to the cluster by mistake. At other times the cluster may simply be overloaded with a high number of I/O-hungry jobs.
Mistral monitors application I/O and cluster performance so that jobs exceeding the expected I/O thresholds can be automatically identified and slowed down through I/O throttling.
Whole cluster I/O monitoring
Mistral can monitor application I/O by wrapping up the jobs on the compute node or by intercepting I/O traffic as it passes through an NFS gateway or SMB gateway.
Jobs with higher-than-expected I/O or higher-than-expected latency generate an alert.
Mistral monitors the number of read() and write() operations, the I/O bandwidth for reads and writes, and the number of metadata operations such as open() or stat().
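The kinds of measurement described above can be illustrated with a minimal sketch. It is not Mistral's implementation or API: the names `JobIOStats`, `record`, `alerts` and the limit keys are hypothetical, and it assumes I/O calls are reported to it one at a time by some wrapper around the job.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical illustration only; none of these names come from Mistral.
METADATA_CALLS = {"open", "stat"}


@dataclass
class JobIOStats:
    """Per-job counters for read/write operations, bandwidth and metadata calls."""
    op_counts: Counter = field(default_factory=Counter)
    bytes_read: int = 0
    bytes_written: int = 0

    def record(self, call: str, nbytes: int = 0) -> None:
        """Account for one intercepted I/O call."""
        if call == "read":
            self.op_counts["read"] += 1
            self.bytes_read += nbytes
        elif call == "write":
            self.op_counts["write"] += 1
            self.bytes_written += nbytes
        elif call in METADATA_CALLS:
            self.op_counts["metadata"] += 1


def alerts(stats: JobIOStats, limits: dict) -> list:
    """Return the sorted names of all metrics that exceed their configured limit."""
    observed = {
        "read_ops": stats.op_counts["read"],
        "write_ops": stats.op_counts["write"],
        "metadata_ops": stats.op_counts["metadata"],
        "bytes_read": stats.bytes_read,
        "bytes_written": stats.bytes_written,
    }
    return sorted(name for name, limit in limits.items() if observed[name] > limit)
```

A job that calls open() and stat() once each against a limit of one metadata operation would be reported by `alerts(stats, {"metadata_ops": 1})`.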
Load balancing for shared storage
As well as monitoring the I/O to detect rogue jobs, Mistral is able to throttle the I/O of problem jobs so that the cluster can recover and all the remaining well-behaved jobs can continue with good performance.
High-priority jobs can be given high limits so that they get a large share of the storage bandwidth. Jobs that do unexpectedly high I/O will be throttled early so that the cluster is not affected.
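One common way to enforce per-job bandwidth limits of this kind is a token bucket. The sketch below shows that general technique and is an assumption, not a description of how Mistral throttles I/O; the class name, parameters and the example rates are all hypothetical.

```python
import time


class TokenBucket:
    """Byte-rate limiter: I/O is delayed once a job has spent its allowance."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float,
                 clock=time.monotonic):
        self.rate = rate_bytes_per_s      # long-term bandwidth limit
        self.capacity = burst_bytes       # largest burst allowed without delay
        self.tokens = burst_bytes         # start with a full bucket
        self.clock = clock                # injectable so the logic can be tested
        self.last = clock()

    def delay_for(self, nbytes: int) -> float:
        """Charge nbytes and return the seconds the caller should wait first."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # The bucket is in debt: wait until the refill has paid it off.
        return -self.tokens / self.rate


# Hypothetical limits: a high-priority job gets a larger share of bandwidth.
LIMITS_BYTES_PER_S = {"high": 500e6, "normal": 100e6}
```

A wrapper around a job's write path could call `time.sleep(bucket.delay_for(len(data)))` before each write, with the bucket's rate taken from the job's priority.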
Once you have built up a picture of how much I/O each job is expected to do, Mistral can build a history of I/O patterns for advanced tuning. This information can be logged by Mistral and used to educate your users and to redesign your workflows and software pipelines to make better use of your shared storage.
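As an illustration of how logged history might feed back into tuning, the sketch below derives a suggested limit from past measurements of a job. The rule used (mean plus k standard deviations) is a simple heuristic chosen for this example, not something Mistral is documented to do, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def suggest_limit(history, k: float = 3.0) -> float:
    """Suggest an I/O limit as mean + k standard deviations of past runs."""
    if not history:
        raise ValueError("history must contain at least one measurement")
    return mean(history) + k * pstdev(history)


def is_unexpected(observed: float, history, k: float = 3.0) -> bool:
    """Report whether a new measurement exceeds the suggested limit."""
    return observed > suggest_limit(history, k)
```

With past bandwidth measurements of 100, 110, 90 and 100 MB/s, a new run at 150 MB/s would be flagged while one at 115 MB/s would not.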