One of Mistral’s key uses is eliminating bottlenecks. Many IT managers struggle to stop individual users from consuming too much bandwidth, which slows down the cluster for everyone.
Watch our video to see how Mistral can set usage limits.
Our lead developer, Paul, demonstrates how to use Mistral to implement a simple fair use policy on a small, five-node Slurm test cluster to maintain good quality of service (QoS).
In the video scenario, Slurm has been set up so that all jobs submitted on the cluster are protected by Mistral. We have defined two simple rules to monitor I/O and to implement low-overhead quality of service. The first is a basic logging rule that records all the read/write bandwidth used on /home. The second is a single I/O bandwidth throttling rule, set very high so that it prevents any single user from doing enough I/O to bring down the cluster without limiting normal day-to-day jobs in any way.
By default, Mistral treats each job independently, but here it has been configured to apply the limit to each user’s aggregate bandwidth across all of their jobs. A similar technique can be used to implement QoS by group or by project. You can also apply similar limits to implement a fair use policy on metadata.
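To make the difference between per-job and per-user limits concrete, here is a minimal Python sketch of the aggregation idea. It is not Mistral’s implementation or its configuration syntax: the `JobSample` record, the function names, the one-second window and the 1 GB/s limit are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSample:
    """I/O seen for one job in one measurement window (hypothetical record)."""
    job_id: str
    user: str
    group: str
    bytes_transferred: int


def aggregate_bandwidth(samples, key, window_seconds):
    """Sum bytes per key (job_id, user or group) and convert to bytes per second."""
    totals = defaultdict(int)
    for sample in samples:
        totals[getattr(sample, key)] += sample.bytes_transferred
    return {name: total / window_seconds for name, total in totals.items()}


def over_limit(samples, key, window_seconds, limit_bytes_per_second):
    """Return the sorted keys whose aggregate bandwidth exceeds the limit."""
    rates = aggregate_bandwidth(samples, key, window_seconds)
    return sorted(name for name, rate in rates.items()
                  if rate > limit_bytes_per_second)


# Illustrative data: alice runs two jobs, bob runs one.
samples = [
    JobSample("job1", "alice", "physics", 600_000_000),
    JobSample("job2", "alice", "physics", 600_000_000),
    JobSample("job3", "bob", "chem", 300_000_000),
]

LIMIT = 1_000_000_000  # assumed limit of 1 GB/s, for illustration only

print(over_limit(samples, "job_id", 1.0, LIMIT))  # no single job is over the limit
print(over_limit(samples, "user", 1.0, LIMIT))    # alice's two jobs together are
print(over_limit(samples, "group", 1.0, LIMIT))   # the same idea applied by group
```

With these numbers, no individual job exceeds the limit, but alice’s two jobs together do, which is the case a per-user aggregate limit is meant to catch. Changing the grouping key from `user` to `group` gives the by-group variant described above.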