PARALLEL

Type of Work

Research


Item

Flexible parallel HDF5


Explanation

Under certain circumstances, parallel I/O in HDF5 is very effective, but because of the complex data and metadata structures in HDF5, parallel I/O can also be cumbersome and slow. Particularly difficult is the need for small metadata writes that accompany the efficient large data writes. A project called "flexible parallel HDF5," which used a set-aside process to manage data and metadata I/O, addressed this problem, but was not very successful in achieving its goals of simpler and more efficient I/O. Other proposals have been put forward, but we have not yet had the funding to pursue them. See the associated documents for a description of flexible parallel HDF5 and the subsequent ideas. These ideas include:

- Complete the unfinished set-aside process work.
- Make it work for chunked storage.
- Add the ability to update dataset attributes.
- Implement collective multiple-dataset open.
- Create one dataset per node in parallel.
- Change the API to allow flexible parallel HDF5, so it does not need collective calls.


Documentation

Folder: Flexible parallel HDF5