Clients want each write operation from a process in a parallel application to result in only one low-level write operation to disk (i.e., only one call to MPI_File_write_at or MPI_File_write_at_all). When using chunked storage for datasets, this is possible only if the data that each process writes is exactly aligned with a single chunk.
For example, suppose the chunks for a dataset are regular and non-overlapping, and are defined like this:
[Figure: dataset divided into regular, non-overlapping chunks]
Each chunk in the dataset is stored contiguously on disk, although the chunks themselves are not necessarily adjacent to one another:
[Figure: on-disk layout of the chunks]
For the purposes of this example, assume that each element in the dataset has a 4-byte datatype, making each chunk 256 bytes (64 elements) in the file. Also assume that the client is using 4 processes to access the dataset.
The following table describes the hyperslab selection for each client process:

| Client Process # | "Start" location | "End" location |
|---|---|---|
| 0 | (0, 0) | (7, 7) |
| 1 | (0, 8) | (7, 15) |
| 2 | (8, 0) | (15, 7) |
| 3 | (8, 8) | (15, 15) |
[Figure: the four selections overlaid on the chunks]
When I/O is performed by each client process with these selections and chunks defined, each process will perform only one I/O operation, as indicated by the following table:

| Client Process # | Number of I/O operations | # of Bytes In Selection | # of Bytes Transferred | Byte Transfer Efficiency |
|---|---|---|---|---|
| 0 | 1 | 256 | 256 | 100% |
| 1 | 1 | 256 | 256 | 100% |
| 2 | 1 | 256 | 256 | 100% |
| 3 | 1 | 256 | 256 | 100% |
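The alignment condition behind this table can be checked mechanically. The following is a minimal sketch, assuming 8 x 8 chunks (inferred from the 256-byte chunk size and 4-byte elements; the original figure is not available) and inclusive "Start"/"End" coordinates. The name `is_chunk_aligned` is hypothetical and not part of any library.

```python
# Assumed chunk dimensions: 256 bytes / 4 bytes per element = 64 elements,
# taken here to be an 8 x 8 chunk.
CHUNK = (8, 8)

def is_chunk_aligned(start, end, chunk=CHUNK):
    """Return True if the inclusive selection [start, end] is exactly one chunk."""
    for s, e, c in zip(start, end, chunk):
        # In every dimension the selection must begin on a chunk boundary
        # and span exactly one chunk.
        if s % c != 0 or e - s + 1 != c:
            return False
    return True

# Hyperslab selections from the table above, keyed by client process number.
selections = {
    0: ((0, 0), (7, 7)),
    1: ((0, 8), (7, 15)),
    2: ((8, 0), (15, 7)),
    3: ((8, 8), (15, 15)),
}

aligned = {rank: is_chunk_aligned(s, e) for rank, (s, e) in selections.items()}
print(aligned)
```

Under these assumptions every process's selection coincides with one chunk, which is why each process needs only a single I/O operation.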
Now, assume that the following selections have been made:

| Client Process # | "Start" location | "End" location |
|---|---|---|
| 0 | (0, 0) | (9, 8) |
| 1 | (0, 9) | (6, 15) |
| 2 | (10, 0) | (15, 8) |
| 3 | (7, 9) | (15, 15) |
[Figure: the four mis-aligned selections overlaid on the chunks]
When file selections and chunks are mis-aligned, the number of I/O operations and the number of bytes transferred to disk are much larger:

| Client Process # | Number of I/O operations | # of Bytes In Selection | # of Bytes Transferred | Byte Transfer Efficiency |
|---|---|---|---|---|
| 0 | 4 | 360 | 1024 | 35.16% |
| 1 | 1 | 196 | 256 | 76.56% |
| 2 | 2 | 216 | 512 | 42.19% |
| 3 | 2 | 252 | 512 | 49.22% |
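The figures in this table can be reproduced with a short calculation. This is a sketch under stated assumptions rather than a description of the library's I/O path: chunks are taken to be 8 x 8 elements of 4 bytes each, every chunk touched by a selection costs one I/O operation, and each touched chunk is transferred whole. The function name `io_cost` is hypothetical.

```python
CHUNK = (8, 8)   # assumed chunk dimensions (64 elements = 256 bytes)
ELEM_SIZE = 4    # bytes per element, as stated in the example

def io_cost(start, end, chunk=CHUNK, elem_size=ELEM_SIZE):
    """Return (I/O ops, bytes selected, bytes transferred, efficiency %)
    for an inclusive rectangular selection on a regular chunk grid."""
    chunks_touched = 1
    elements = 1
    chunk_elems = 1
    for s, e, c in zip(start, end, chunk):
        # Number of chunks spanned along this dimension.
        chunks_touched *= e // c - s // c + 1
        elements *= e - s + 1
        chunk_elems *= c
    selected = elements * elem_size
    # Assumption: each touched chunk is read or written in full.
    transferred = chunks_touched * chunk_elems * elem_size
    return chunks_touched, selected, transferred, 100.0 * selected / transferred

# Mis-aligned selections from the table above, keyed by client process number.
misaligned = {
    0: ((0, 0), (9, 8)),
    1: ((0, 9), (6, 15)),
    2: ((10, 0), (15, 8)),
    3: ((7, 9), (15, 15)),
}

for rank, (start, end) in misaligned.items():
    ops, sel, xfer, eff = io_cost(start, end)
    print(f"{rank} | {ops} | {sel} | {xfer} | {eff:.2f}%")
```

Process 0 is the worst case because its selection crosses a chunk boundary in both dimensions and therefore touches all four chunks.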
However, if chunks are allowed to be variable-sized, they could be defined like this:
[Figure: dataset divided into variable-sized chunks]
Each chunk in the dataset is still stored contiguously on disk, although the chunks themselves are not necessarily adjacent to one another:
[Figure: on-disk layout of the variable-sized chunks]
Then, the variable-sized selections for each client process (as defined above) would be aligned with the chunk boundaries:
[Figure: the four selections overlaid on the variable-sized chunks]
The number of I/O operations and the number of bytes transferred would then be optimal again:

| Client Process # | Number of I/O operations | # of Bytes In Selection | # of Bytes Transferred | Byte Transfer Efficiency |
|---|---|---|---|---|
| 0 | 1 | 360 | 360 | 100% |
| 1 | 1 | 196 | 196 | 100% |
| 2 | 1 | 216 | 216 | 100% |
| 3 | 1 | 252 | 252 | 100% |
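The same accounting can be applied to variable-sized chunks by describing each chunk as an explicit box instead of a position on a regular grid. The sketch below assumes a hypothetical layout in which each chunk is exactly one process's selection from the second selection table, and again assumes one I/O operation and one whole-chunk transfer per chunk touched. The helper names are illustrative only.

```python
ELEM_SIZE = 4  # bytes per element, as stated in the example

# Hypothetical variable-sized chunk layout: one inclusive (start, end) box
# per chunk, chosen to match the four client selections.
chunks = [
    ((0, 0), (9, 8)),
    ((0, 9), (6, 15)),
    ((10, 0), (15, 8)),
    ((7, 9), (15, 15)),
]

def box_elems(start, end):
    """Number of elements in an inclusive box."""
    n = 1
    for s, e in zip(start, end):
        n *= e - s + 1
    return n

def overlaps(a, b):
    """True if two inclusive boxes share at least one element."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a_lo, a_hi, b_lo, b_hi))

def io_cost(selection, chunks, elem_size=ELEM_SIZE):
    """Return (I/O ops, bytes selected, bytes transferred, efficiency %)."""
    touched = [c for c in chunks if overlaps(selection, c)]
    selected = box_elems(*selection) * elem_size
    # Assumption: each touched chunk is read or written in full.
    transferred = sum(box_elems(*c) for c in touched) * elem_size
    return len(touched), selected, transferred, 100.0 * selected / transferred

# Each client process selects exactly one of the variable-sized chunks.
for rank, selection in enumerate(chunks):
    ops, sel, xfer, eff = io_cost(selection, chunks)
    print(f"{rank} | {ops} | {sel} | {xfer} | {eff:.0f}%")
```

Because every selection coincides with exactly one chunk, each process touches one chunk and transfers no bytes outside its selection.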