Genomics pipelines can be really interesting to monitor from an I/O perspective. They’re normally highly tuned but, even so, there are real improvements to be made that can only be uncovered through I/O profiling.
We decided to team up with Alces Flight to quickly create an HPC cluster on Amazon Web Services (AWS) on which to trace a cloud-ready genomics pipeline. Alces Flight provides a fully-featured, scalable HPC environment for research and scientific computing, architected to operate on both cloud and bare-metal platforms and complete with job scheduler and applications.
The pipeline we picked took 10 hours to run, and we ran the test over the course of that run. The target genomics pipeline was held in a container and profiled with Mistral, yielding performance metrics that can be fed back for continued pipeline improvement.
Download the full white paper to read more about the test and what we were looking for.
The benefits of optimisation
The team at Alces saw the I/O profiling challenge as a means to demonstrate the strengths of both Flight and Ellexus.
“We are just as concerned about optimisation and performance as Ellexus,” said Wil Mayers, Technical Director at Alces Flight. “To be able to create HPC clusters in minutes takes a huge burden off of the user who would otherwise be building from scratch.
“With applications, optimisation is everything. We find that, on average, Flight gets you about 80% there. You’d be surprised how much work goes into that last 20%. Adding Ellexus gets you closer to that elusive 100% by continuously looking for more ways to improve.”
Ellexus CEO Rosemary said of the test: “We found that changes to small writes and reads could bring about more time and cost savings in the long run. From this already highly tuned pipeline Mistral could see that room for improvement that we know will always exist in cloud HPC.
“We’re very pleased with the results and we look forward to bringing more speed and efficiency to the cloud in the future.”