Adding a "small data" Block Allocation Mechanism to HDF5

Quincey Koziol
koziol@ncsa.uiuc.edu
June 5, 2002

  1. Document's Audience:

  2. Background Reading:

  3. Motivation:

    What Is a Block Allocation Mechanism?

    A block allocation mechanism is a method the library uses to group "small" objects together, in order to increase I/O performance. The small objects can often be accessed together in fewer read or write calls.

    This is currently implemented in the HDF5 library by a "metadata block aggregation" algorithm which allocates a fixed-size block for "generic" metadata, and then preferentially sub-allocates small metadata allocations from that larger block.

    What Is "small data"?

    "Small data" is the "raw" dataset data stored in the file that is "small". "Small" in this case means it is similar in size to the typical size of metadata stored in the file. [Normally, this tends to be around 200-300 bytes in size.]

    Why Add a "small data" Block Allocation Mechanism?

    When the size of raw data for datasets is on the same order as the size of metadata in the file (or smaller), the allocation behavior of the raw data resembles that of the metadata and benefits from the same kind of block allocation algorithm. The metadata in a file is already handled by a block allocation mechanism, but the "small data" in the file would benefit from a separate block allocation mechanism of its own. An API function is provided to adjust the fixed-size block for "small data" up or down from its default setting of 2KB.

    Will Adding a "small data" Block Allocation Mechanism Hurt "large data" I/O?

    No. This block allocation mechanism is only used when raw data is small enough to fit into the current allocation block. Raw data that is larger than the space available in the block is allocated within the file in the normal manner.

  4. Feature's Primary Users:

    HDF5 Applications Which Create Small Datasets
    This feature primarily benefits applications which create many small datasets in a file.

  5. Design Goals & Requirements:

  6. Proposed Changes and Additions to Library Behavior:

    This proposed change adds a block allocation mechanism for "small data" in the file to the library. This is done in a manner nearly identical to the metadata block allocation mechanism in the library, which operates as follows:

    When space for metadata is requested, the block allocator checks whether the requested space is small enough to fit into its current block. If the block is sufficiently large, the space is sub-allocated from the block and the information about the block is updated. If the space requested is larger than the space remaining in the current block, but smaller than the block's initial size, the remaining space in the block is returned to the file's free list, a new block is allocated, and the requested space is sub-allocated from the new block (and the information about the block is updated). If the space requested is larger than the block's initial size, space is allocated specifically for the new metadata and the block is unchanged.

    The "small data" block allocation mechanism would operate in the same fashion, although operating on raw data instead of metadata.
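    The three-way decision described above can be sketched in C as follows. This is an illustrative model only; the `aggregator_t` struct, the `decide` function, and the field names are hypothetical, not HDF5 internals:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one aggregation block's state. */
typedef struct {
    size_t block_size;  /* initial size of each aggregation block */
    size_t space_left;  /* unused bytes remaining in the current block */
} aggregator_t;

typedef enum {
    ALLOC_FROM_BLOCK,     /* fits in the remainder of the current block */
    ALLOC_FROM_NEW_BLOCK, /* release remainder to free list, start a new block */
    ALLOC_DIRECT          /* larger than a whole block: allocate from the file */
} alloc_decision_t;

/* Route a space request according to the algorithm described in the text. */
alloc_decision_t decide(const aggregator_t *agg, size_t request)
{
    if (request <= agg->space_left)
        return ALLOC_FROM_BLOCK;
    if (request <= agg->block_size)
        return ALLOC_FROM_NEW_BLOCK;
    return ALLOC_DIRECT;
}
```

    For example, with 2KB blocks and 300 bytes left in the current block, a 200-byte request is sub-allocated in place, a 1KB request triggers a new block, and a 4KB request goes directly to the file.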

  7. API Changes Required:

    Adding this mechanism to the library requires a pair of "get/set" property list functions to adjust the initial size of the block from which "small" raw data is sub-allocated.

  8. Alternate Approaches and Other Enhancements:

    It might be useful to specify more than one size of block to sub-allocate out of, e.g. one block for allocations of less than 2KB, another block for allocations from 2-64KB, etc., up to a final limit beyond which allocations are made directly from the file. Given the way free space in the file is currently handled, however, this may be a bad idea...
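    Routing a request to the smallest block class that can hold it could be sketched as below. The 2KB and 64KB class sizes are just the example figures from the text; this multi-tier scheme is a hypothetical enhancement, not an implemented feature:

```c
#include <assert.h>
#include <stddef.h>

/* Example block classes from the text: 2KB and 64KB (illustrative only). */
size_t tiers[] = { 2048, 65536 };
#define NTIERS (sizeof(tiers) / sizeof(tiers[0]))

/* Return the index of the smallest block class that can hold the request,
 * or -1 if the request exceeds every class and must be allocated directly
 * from the file. */
int choose_tier(size_t request)
{
    for (size_t i = 0; i < NTIERS; i++)
        if (request <= tiers[i])
            return (int)i;
    return -1;
}
```

    One drawback this sketch makes visible: every tier can strand a partially used block, multiplying the free-space fragments the file's free list must track, which is why the text is skeptical of the idea.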

  9. Parallel I/O Repercussions:

    None. We require all space allocations in a file to be performed collectively, so all processes will make identical decisions about allocating from the "small data" block or from the file directly.

  10. Forward/Backward Compatibility Repercussions:

    Backward compatibility is the ability for applications using the HDF5 library to compile and link with future versions of the library. Forward compatibility is the ability for applications using the HDF5 library to compile and link with previous versions of the library.

    Adding this change has no forward or backward compatibility issues, since it is only adding new API functions and the behavior of current API functions does not change.

  11. File Format Changes:

    None.

  12. API Functions Added:


    NAME
    H5Pset_small_data_block_size
    PURPOSE
    Set the size of the block used to sub-allocate "small" dataset data from.
    USAGE
    herr_t H5Pset_small_data_block_size(fapl_id, size)
      hid_t fapl_id; IN: File access property list ID
      hsize_t size; IN: The maximum size of the block used to sub-allocate "small data" from.
    RETURNS
    Success: non-negative value
    Failure: negative value
    DESCRIPTION
    Set the size of the block used to sub-allocate "small" dataset data from. Setting the size parameter to 0 disables the "small data" block allocation mechanism.
    COMMENTS, BUGS, ASSUMPTIONS
    ?

    NAME
    H5Pget_small_data_block_size
    PURPOSE
    Get the size of the block used to sub-allocate "small" dataset data.
    USAGE
    herr_t H5Pget_small_data_block_size(fapl_id, size)
      hid_t fapl_id; IN: File access property list ID
      hsize_t * size; OUT: The maximum size of the block used to sub-allocate "small data" from.
    RETURNS
    Success: non-negative value
    Failure: negative value
    DESCRIPTION
    Retrieve the size of the block used to sub-allocate "small" dataset data.
    COMMENTS, BUGS, ASSUMPTIONS
    ?
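
    To illustrate the intended semantics of this get/set pair without depending on an HDF5 build, the sketch below models the file access property list as a plain struct; the stand-in names (`fapl_model_t`, `model_set_...`, `model_get_...`) are hypothetical, mirroring only the proposed behavior (default block size of 2KB, size 0 disables the mechanism, negative return on failure):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long my_hsize_t;  /* stand-in for HDF5's hsize_t */

/* Stand-in model of a file access property list holding only the
 * "small data" block size property. */
typedef struct {
    my_hsize_t small_data_block_size;
} fapl_model_t;

/* Default block size of 2KB, per the proposal. */
#define FAPL_MODEL_DEFAULT { 2048 }

/* Mirrors the proposed H5Pset_small_data_block_size(); a size of 0
 * disables the "small data" block allocation mechanism. */
int model_set_small_data_block_size(fapl_model_t *fapl, my_hsize_t size)
{
    fapl->small_data_block_size = size;
    return 0;  /* non-negative: success */
}

/* Mirrors the proposed H5Pget_small_data_block_size(). */
int model_get_small_data_block_size(const fapl_model_t *fapl,
                                    my_hsize_t *size)
{
    if (size == NULL)
        return -1;  /* negative: failure */
    *size = fapl->small_data_block_size;
    return 0;
}
```

    An application would set the property on a file access property list before opening or creating the file, so that all "small data" allocations in that file use the chosen block size.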