Reading or writing dataset elements

Performing I/O on dataset elements is one of the more complex aspects of the HDF5 library (as you might expect from an I/O library) and has multiple components and layers, which are only partially documented here.  Each storage method for a dataset (contiguous, chunked, etc.) is handled somewhat differently, with some shared components or layers, so the common aspects are described in one section and the special aspects of each storage method are documented in separate sections.


When an MPI application performs I/O on dataset elements, the default action is to perform independent I/O operations from each process.  In that case, the underlying MPI-IO or MPI-POSIX VFD eventually performs the read or write operation, but until that point the I/O operation is handled exactly as it is for non-MPI applications.  Since no special actions are taken to use the MPI interface until the appropriate VFD is reached, independent I/O is not discussed further here, except to note that under certain circumstances (described in context in other locations in this document set) an application may initiate a collective I/O operation that the HDF5 library "breaks" into independent accesses in order to perform the I/O operation.


When an MPI application has opened a file using the MPI-IO VFD, collective I/O operations on dataset elements may be performed.  To invoke collective I/O when accessing dataset elements, the application must collectively create a dataset transfer property list (DXPL), call H5Pset_dxpl_mpio on that DXPL to set the I/O transfer mode to H5FD_MPIO_COLLECTIVE, and then use that DXPL in calls to H5Dread or H5Dwrite.  Collective read and write I/O operations on dataset elements are generally handled identically by the HDF5 library, except for the eventual read or write call to the VFD layer, and so are treated identically by this document set unless noted in context.


The actions for performing collective I/O are identical for all dataset storage types; the correct routines for each storage method are selected by setting function pointers in the "I/O information" for the I/O operation:
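As a rough illustration of the function-pointer dispatch described above, the following self-contained C sketch shows one shared driver path whose storage-specific routines are chosen when the I/O information is set up.  All type and function names here (io_info_t, io_info_init, contig_write, and so on) are hypothetical and are not the HDF5 library's internal names; the routines only count calls rather than perform I/O.

```c
#include <stddef.h>

/* Hypothetical storage methods, for illustration only. */
typedef enum { LAYOUT_CONTIGUOUS, LAYOUT_CHUNKED } layout_t;

struct io_info;
typedef int (*io_op_t)(struct io_info *info, void *buf, size_t nbytes);

/* Hypothetical "I/O information": carries the routines for one operation. */
typedef struct io_info {
    layout_t layout;
    io_op_t  read_op;       /* storage-specific read routine  */
    io_op_t  write_op;      /* storage-specific write routine */
    int      contig_calls;  /* call counters, so the dispatch is observable */
    int      chunked_calls;
} io_info_t;

/* Stand-ins for the storage-specific routines; they do no real I/O. */
static int contig_read(io_info_t *info, void *buf, size_t nbytes)
{ (void)buf; (void)nbytes; info->contig_calls++; return 0; }

static int contig_write(io_info_t *info, void *buf, size_t nbytes)
{ (void)buf; (void)nbytes; info->contig_calls++; return 0; }

static int chunked_read(io_info_t *info, void *buf, size_t nbytes)
{ (void)buf; (void)nbytes; info->chunked_calls++; return 0; }

static int chunked_write(io_info_t *info, void *buf, size_t nbytes)
{ (void)buf; (void)nbytes; info->chunked_calls++; return 0; }

/* Select the routines once, based on the dataset's storage method. */
int io_info_init(io_info_t *info, layout_t layout)
{
    info->layout = layout;
    info->contig_calls = 0;
    info->chunked_calls = 0;
    switch (layout) {
    case LAYOUT_CONTIGUOUS:
        info->read_op = contig_read;
        info->write_op = contig_write;
        return 0;
    case LAYOUT_CHUNKED:
        info->read_op = chunked_read;
        info->write_op = chunked_write;
        return 0;
    }
    return -1; /* unknown storage method */
}

/* Shared driver code: identical for every storage method. */
int dataset_read(io_info_t *info, void *buf, size_t nbytes)
{ return info->read_op(info, buf, nbytes); }

int dataset_write(io_info_t *info, void *buf, size_t nbytes)
{ return info->write_op(info, buf, nbytes); }
```

The point of this arrangement is that the driver never branches on the storage method after setup; supporting another storage method only requires new routines and one more case in the setup function.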


The "multiple" I/O operations for collective I/O on each type of dataset storage method are further described here:

