Ellexus is the I/O profiling company and we are often asked how our tools compare to Darshan. Darshan is a popular open-source tool for profiling MPI I/O in high-performance computing environments. In this comparison we look at some of the benefits of Darshan and how it compares with our tools.
At the highest level, Darshan can be compared with our Breeze and Mistral product suites. Our tools have a wider range of functionality, but for simply collecting I/O stats, they align quite nicely. Breeze is detailed, but has a high overhead. Mistral is lightweight and can be run in production to find I/O issues, but doesn’t contain as much information. We find that Darshan sits somewhere in between with a particular focus on providing information about the frequency and performance of MPI operations.
The table below gives an overview of functionality and features within the I/O profiling space:
| Feature | Ellexus Breeze | Darshan | Ellexus Mistral |
|---|---|---|---|
| Per file data | Detailed | Overview | Limited |
| List of dependencies | MPI and Posix files, programs, libraries, network connections | MPI files only | No |
| Time series data | Yes | Limited | Yes |
| Independent vs shared files | Yes | Yes | No |
| – Read, write and seek | Yes | Yes | Yes |
| – Create, delete | Yes | No | Yes |
| Whole system overview | No | No | Yes |
| QoS and I/O throttling | No | No | Yes |
| Time spent in I/O | Yes | Yes | Yes |
| I/O operation performance | Yes | Limited | Yes |
| High-level user overview | Yes | Yes | Yes |
| Expertise required | High/Low options | High | Medium |
Which tool should you pick?
In summary, depending on what you want to see, Darshan might be the tool for you. A lot of the detail is there in the data and the range of graphs is very good.
Where it falls down is in usability for developers who are not experts in I/O profiling, and in the accessibility of the data. Darshan is great for some stats, but even after spending a lot of time working with the tool, we are still not sure what some of the reports are telling us. The pre-generated reports give a lot of information, but there doesn’t seem to be a way of getting at more detailed profiling data under the hood.
Looking at the raw data, you don’t get much more than the graphical reports. What we’d like to see is more explanation of why some of the graphs were developed. Presumably someone had a need for that data at some point, and others can learn from that use case.
In contrast, we have designed our tools to suit a range of levels of expertise, to make sure the information provided is accessible.
A further difference is that while Darshan gives some really useful information for people looking to optimise MPI applications, it is quite limited outside of this scope because it excludes most Posix I/O outside of the MPI context. Our tools were designed to protect against problems with all disk and network I/O, so Darshan adds some complementary MPI-specific insights but covers a much smaller problem space.
Get in touch if you’d like to know more about our tools.