This document describes the HDF5 file format changes related to the proposed dataset fill value changes. Please also refer to the proposal and API document for related information.
This document focuses on two modified header message formats. In the layout message, the “address” value can now express unallocated space. Four new fields have been added to the fill value message. The sections below give the details.
Type: 0x0008
Length: varies
Status: Required for datasets, may not be repeated.
Purpose and Description: Data layout describes how the elements of a multi-dimensional array are arranged in the linear address space of the file. Two types of data layout are supported: contiguous and chunked.
| byte | byte | byte | byte |
|---|---|---|---|
| Version | Dimensionality | Layout Class | Reserved |
| Reserved ||||
| Address ||||
| Dimension 0 (4 bytes) ||||
| Dimension 1 (4 bytes) ||||
| … ||||
| Field Name | Description |
|---|---|
| Version | A version number for the layout message. This documentation describes version two. (A word about backward compatibility: to minimize risk, the version number is set to two if space for the dataset is not allocated when the dataset is created, and to one if space is allocated when the dataset is created.) |
| Dimensionality | An array has a fixed dimensionality. This field specifies the number of dimension size fields later in the message. |
| Layout Class | The layout class specifies how the other fields of the layout message are to be interpreted. A value of one indicates contiguous storage, while a value of two indicates chunked storage. Other values will be defined in the future. |
| Address | For contiguous storage, this is the address of the first byte of storage. The address is initialized to HADDR_UNDEF (-1) to indicate that storage space has not been allocated. For chunked storage, this is the address of the B-tree that is used to look up the addresses of the chunks. |
| Dimensions | For contiguous storage, the dimensions define the entire size of the array, while for chunked storage they define the size of a single chunk. |
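
The field layout above can be illustrated with a small decoder. This is only a sketch, not the library's implementation: the function name `parse_layout_message` is hypothetical, little-endian byte order is assumed, and because this section does not state the width of the Address field, 8 bytes is assumed and exposed as the `offset_size` parameter. An address with all bits set is treated as HADDR_UNDEF (-1), meaning storage has not been allocated.

```python
import struct

# Layout class codes as described in the field table above.
LAYOUT_CLASSES = {1: "contiguous", 2: "chunked"}


def parse_layout_message(buf, offset_size=8):
    """Decode a layout message body (hypothetical helper).

    Assumptions, not taken from the source text: little-endian encoding
    and an Address field of `offset_size` bytes.
    """
    # Bytes 0..2: version, dimensionality, layout class.
    version, ndims, layout_class = struct.unpack_from("<BBB", buf, 0)

    # Byte 3 and bytes 4..7 are reserved, so the address starts at byte 8.
    pos = 8
    address = int.from_bytes(buf[pos:pos + offset_size], "little")
    pos += offset_size

    # One 4-byte size per dimension follows the address.
    dims = struct.unpack_from("<%dI" % ndims, buf, pos)

    # HADDR_UNDEF (-1) is every bit set in the address field.
    undef = (1 << (8 * offset_size)) - 1
    return {
        "version": version,
        "layout_class": LAYOUT_CLASSES.get(layout_class, "unknown"),
        "address": None if address == undef else address,  # None = unallocated
        "dims": list(dims),
    }


# Example: a version-two, contiguous, 100 x 200 dataset with no storage yet.
example = (bytes([2, 2, 1, 0]) + bytes(4) + b"\xff" * 8
           + struct.pack("<II", 100, 200))
print(parse_layout_message(example))
```

Returning `None` for an undefined address keeps callers from mistaking the sentinel for a real file offset.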
Type: 0x0005
Length: varies
Status: Optional, may not be repeated.
The fill value message stores a single data value (which may be of a compound type) together with its related properties: the space allocation time, the fill value write time, and whether a fill value is defined. Whether the fill value is written to the dataset or returned to the user depends on these properties. The fill value is interpreted using the same datatype as the dataset.
| byte | byte | byte | byte |
|---|---|---|---|
| Version | Space allocate time | Fill value write time | Fill value defined |
| Size (4 bytes) ||||
| Fill value ||||
| Field Name | Description |
|---|---|
| Version | A version number for the fill value message. This document describes version one. |
| Space allocate time | When to allocate storage space. A value of one allocates space as early as dataset creation; a value of two allocates it as late as when the user’s data is written to the dataset. |
| Fill value write time | When to write the fill value to the dataset. A value of zero indicates that the fill value is never written; a value of one means that the fill value is written to the entire dataset once storage space is allocated. |
| Fill value defined | Whether a fill value is defined. A value of zero means undefined; a value of one means defined (default or user-defined). If undefined, the “size” field has the value zero and the “fill value” field does not exist. |
| Size (4 bytes) | The size of the fill value field in bytes. If the fill value is of a compound type, this is the size of the whole compound datatype. |
| Fill value | The actual fill value. The bytes of the fill value are interpreted using the same datatype as the dataset. |
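
To make the byte layout of the fill value message concrete, here is a minimal encode/decode sketch. The helper names `pack_fill_value_message` and `parse_fill_value_message` are hypothetical, and little-endian byte order is assumed; the field order, the one-byte property fields, and the 4-byte size come from the tables above.

```python
import struct

# Four one-byte fields followed by a 4-byte size; "<" means little-endian
# (an assumption) with no padding, so the fixed part is 8 bytes long.
FIXED_PART = "<BBBBI"
FIXED_SIZE = struct.calcsize(FIXED_PART)


def pack_fill_value_message(alloc_time, write_time, fill_bytes=None):
    """Build a version-one fill value message (hypothetical helper).

    Passing fill_bytes=None produces an undefined fill value: the
    "defined" flag is zero, size is zero, and no fill value bytes follow.
    """
    defined = 0 if fill_bytes is None else 1
    payload = b"" if fill_bytes is None else bytes(fill_bytes)
    fixed = struct.pack(FIXED_PART, 1, alloc_time, write_time,
                        defined, len(payload))
    return fixed + payload


def parse_fill_value_message(buf):
    """Decode a fill value message body (hypothetical helper)."""
    version, alloc_time, write_time, defined, size = struct.unpack_from(
        FIXED_PART, buf, 0)
    if not defined and size != 0:
        raise ValueError("undefined fill value must have size zero")
    if len(buf) < FIXED_SIZE + size:
        raise ValueError("message is shorter than its size field claims")
    fill = bytes(buf[FIXED_SIZE:FIXED_SIZE + size]) if defined else None
    return {
        "version": version,
        "alloc_time": alloc_time,   # 1 = at dataset creation, 2 = at first write
        "write_time": write_time,   # 0 = never, 1 = when space is allocated
        "defined": bool(defined),
        "fill_value": fill,         # raw bytes; interpret with the dataset's datatype
    }


# Example: late allocation, write on allocation, 32-bit integer fill of -999.
message = pack_fill_value_message(2, 1, struct.pack("<i", -999))
print(parse_fill_value_message(message))
```

The decoder returns the fill value as raw bytes because the message itself carries no datatype; the caller has to apply the dataset's datatype to interpret them.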