Improved Performance with Parallel I/O
Parallel NetCDF
Another portable file format is NetCDF [8]. The current version, NetCDF-4, can use HDF5 as its underlying file format. APIs for NetCDF include C, C++, Fortran, Python, Java, Perl, MATLAB, Octave, and more.
As with HDF5, NetCDF has a parallel version, Parallel-NetCDF [9], which also uses MPI-IO. This version is based on NetCDF 3 and was developed at Argonne National Laboratory. To perform parallel I/O with NetCDF 4, you go through its HDF5 capability, making sure the underlying HDF5 library was built with MPI-IO support.
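As a rough sketch of what the NetCDF-4 route can look like in C, the following example has every TP write one value to a shared variable in one file. It assumes NetCDF was configured with parallel support (and HDF5 with MPI-IO); the file, dimension, and variable names are purely illustrative, and error checking is omitted for brevity.

```c
/* Minimal sketch: parallel I/O through NetCDF-4's HDF5 back end.
 * Assumes a parallel-enabled NetCDF/HDF5 build; names are illustrative. */
#include <mpi.h>
#include <netcdf.h>
#include <netcdf_par.h>

int main(int argc, char **argv)
{
    int rank, nprocs, ncid, dimid, varid;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Every TP opens the same file collectively over MPI-IO. */
    nc_create_par("output.nc", NC_NETCDF4 | NC_MPIIO,
                  MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);

    /* One value per TP in a single shared variable. */
    nc_def_dim(ncid, "ntasks", (size_t)nprocs, &dimid);
    nc_def_var(ncid, "data", NC_DOUBLE, 1, &dimid, &varid);
    nc_enddef(ncid);

    /* Collective access tends to perform best on parallel filesystems. */
    nc_var_par_access(ncid, varid, NC_COLLECTIVE);

    size_t start = (size_t)rank, count = 1;
    double value = (double)rank;
    nc_put_vara_double(ncid, varid, &start, &count, &value);

    nc_close(ncid);
    MPI_Finalize();
    return 0;
}
```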
Recommendations
If you have an application that handles I/O serially and I/O accounts for a significant portion of your run time, you could benefit from modifying the application to perform parallel I/O. The fun part is deciding how to do it.
I recommend you start very simply and with a small-ish number of cores. I would begin with the file-per-process approach, in which each TP performs I/O to its own file. This solution is really only suitable for small numbers of TPs, but it is fairly simple to code; just be sure each TP uses a unique file name. This approach places more burden on the pre-processing and post-processing tools, but the application itself will see better I/O performance.
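A minimal C sketch of this pattern might look like the following; the rank embedded in the file name guarantees uniqueness (the name pattern itself is illustrative):

```c
/* Minimal sketch of the file-per-process approach: each TP writes
 * its own uniquely named file with plain serial I/O. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double data[1024];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < 1024; i++)    /* stand-in for real results */
        data[i] = (double)rank;

    /* Embed the rank in the file name so every TP's file is unique. */
    char fname[64];
    snprintf(fname, sizeof(fname), "output_%05d.dat", rank);

    FILE *fp = fopen(fname, "wb");    /* ordinary serial I/O per TP */
    if (fp != NULL) {
        fwrite(data, sizeof(double), 1024, fp);
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}
```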
The second approach I would take is to use a high-level library such as Parallel HDF5. The library uses MPI-IO underneath to get improved I/O performance, although it might require some tuning. The benefit of using a high-level library is that you get a common, portable format across platforms with some possible I/O performance improvement.
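To give a flavor of Parallel HDF5, the sketch below has every TP write one row of a shared two-dimensional dataset in a single file. It assumes HDF5 was built with parallel (MPI-IO) support; the file and dataset names are illustrative, and error checking is omitted for brevity.

```c
/* Minimal Parallel HDF5 sketch: all TPs write disjoint rows of one
 * dataset in a shared file over MPI-IO. */
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* File access property list: route HDF5 through MPI-IO. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("output.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One row of 1024 doubles per TP. */
    hsize_t dims[2] = {(hsize_t)nprocs, 1024};
    hid_t filespace = H5Screate_simple(2, dims, NULL);
    hid_t dset = H5Dcreate(file, "data", H5T_NATIVE_DOUBLE, filespace,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Select this TP's row (hyperslab) in the file. */
    hsize_t start[2] = {(hsize_t)rank, 0}, count[2] = {1, 1024};
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
    hid_t memspace = H5Screate_simple(2, count, NULL);

    /* Collective transfers usually give the best performance. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

    double buf[1024];
    for (int i = 0; i < 1024; i++)
        buf[i] = (double)rank;
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
    MPI_Finalize();
    return 0;
}
```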
Beyond high-level libraries, your remaining choices are writing to MPI-IO directly or confining all I/O to a single TP. Writing applications with MPI-IO can be difficult, but it can also reap the biggest I/O performance boost. Having one TP perform all of the I/O can be a little complicated as well, but it is a very common I/O pattern for parallel applications.
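For a taste of the raw MPI-IO route, here is a minimal sketch in which every TP writes its block of one shared file at an offset derived from its rank (the file name is illustrative, and error checking is again omitted):

```c
/* Minimal MPI-IO sketch: every TP writes its block of a single shared
 * file at a rank-derived offset, using a collective write. */
#include <mpi.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank;
    double data[N];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < N; i++)       /* stand-in for real results */
        data[i] = (double)rank;

    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each TP's data lands at rank * N doubles into the file; the
     * collective _all variant lets the MPI library optimize the writes. */
    MPI_Offset offset = (MPI_Offset)rank * N * sizeof(double);
    MPI_File_write_at_all(fh, offset, data, N, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```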
Don't be afraid of jumping into parallel I/O with both feet, because you can get some really wonderful performance improvements.
Infos
- Amdahl's Law: http://www.admin-magazine.com/HPC/Articles/Failure-to-Scale
- MPI: https://computing.llnl.gov/tutorials/mpi/
- MPI and MPI-IO training tutorial: https://www.hpc.ntnu.no/display/hpc/MPI+and+MPI+IO+Training+Tutorial
- MPI-IO: http://beige.ucs.indiana.edu/I590/node86.html
- Liao, Wei-keng, and Rajeev Thakur. "MPI-IO." In: High Performance Parallel I/O, chapter 13. Chapman and Hall/CRC, 2014, pp. 157-167, http://www.mcs.anl.gov/papers/P5162-0714.pdf
- HDF5: http://www.hdfgroup.org/
- Parallel HDF5: https://www.hdfgroup.org/HDF5/PHDF5/
- NetCDF: https://en.wikipedia.org/wiki/NetCDF
- Parallel-NetCDF: https://en.wikipedia.org/wiki/NetCDF#Parallel-NetCDF