Last modified: 10 September 2013
Tool Name: h5repack
Syntax:
h5repack [OPTIONS] in_file out_file

h5repack -i in_file -o out_file [OPTIONS]

Purpose:
Copies an HDF5 file to a new file with or without compression and/or chunking.

Description:
h5repack is a command line tool that applies HDF5 filters to an input file in_file, saving the output in a new output file, out_file.

Options and Parameters:
-i in_file
Input HDF5 file

-o out_file
Output HDF5 file

-h   or  --help
Print help message.

-v   or  --verbose
Print verbose output.

-V   or  --version
Print version number.

-n   or  --native
Use native HDF5 datatypes when repacking.
(Default behavior is to use original file datatypes.)
Note that this is a change in default behavior; prior to Release 1.6.6, h5repack generated files only with native datatypes.

-L   or  --latest
Use latest version of the HDF5 file format.

-c max_compact_links   or  --compact=max_compact_links
Set the maximum number of links, max_compact_links, that can be stored in a group header message (compact format).

-d min_indexed_links   or  --indexed=min_indexed_links
Set the minimum number of links, min_indexed_links, in the indexed format.

max_compact_links and min_indexed_links are closely related: the first must be equal to or greater than the second. In general, however, performance will suffer, possibly dramatically, if they are equal; tuning the gap between the two values reduces unnecessary thrashing between compact and indexed storage as group size grows and shrinks. The relationship matters most when group sizes are highly dynamic and much less in files with a stable structure. Compact mode is space- and performance-efficient when groups have few members; indexed mode requires slightly more storage space but provides increasingly better performance as the number of members in each group increases.
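
As a minimal sketch of the constraint described here, the following shell fragment checks that max_compact_links is not smaller than min_indexed_links before composing a command line. The values 16 and 8 are placeholders chosen for illustration, not recommendations, and the command is printed rather than executed.

```shell
# Illustrative values only; tune the gap for your own workload.
max_compact_links=16
min_indexed_links=8

# The value given to -c must be equal to or greater than the value given to -d.
if [ "$max_compact_links" -ge "$min_indexed_links" ]; then
    links_cmd="h5repack -c $max_compact_links -d $min_indexed_links in_file out_file"
    echo "$links_cmd"
else
    echo "max_compact_links must be >= min_indexed_links" >&2
fi
```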

-m size   or  --minimum=size
Apply filter(s) only to objects whose size in bytes is equal to or greater than size.
size must be an integer greater than one (1).

Default:  If no size is specified, a threshold of 1024 bytes is assumed.

-u file   or  --ublock=file
Specify name of file containing user block data to be added.

-b user_block_size   or  --block=user_block_size
Set size in bytes of user block to be added.
user_block_size must be 512 or greater and a power of 2.

Default:  1024
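
The size rule for -b (512 or greater and a power of 2) can be checked before calling the tool. The helper below is a hypothetical sketch, not part of h5repack; it uses the usual bit test in which a power of 2 has no bits in common with the number one below it.

```shell
# Hypothetical helper: succeeds if $1 is 512 or greater and a power of 2.
is_valid_user_block_size() {
    [ "$1" -ge 512 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Try a few candidate sizes.
for size in 256 512 768 2048; do
    if is_valid_user_block_size "$size"; then
        echo "$size: valid"
    else
        echo "$size: invalid"
    fi
done
```

A size that passes the check could then be supplied as, for example, h5repack -u ublock.bin -b 2048 in_file out_file, where ublock.bin is a placeholder name for the file holding the user block data.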

-M size   or  --metadata_block_size=size
Metadata block size to be used when h5repack calls H5Pset_meta_block_size.
size must be a non-negative integer.

-t alignment_threshold   or  --threshold=alignment_threshold
Set threshold value for H5Pset_alignment call.
alignment_threshold must be an integer.

-a alignment   or  --alignment=alignment
Set alignment value for H5Pset_alignment call.
alignment must be a positive integer.

-s min_size[:header_type]   or  --ssize=min_size[:header_type]
Set the minimum size of optionally specified types of shared object header messages.

min_size is the minimum size, in bytes, of a shared object header message. Header messages smaller than the specified size will not be shared.

header_type specifies the type(s) of header message that this minimum size is to be applied to. Valid values of header_type are any of the following:
  dspace  for dataspace header messages
  dtype   for datatype header messages
  fill    for fill values
  pline   for filter pipeline header messages
  attr    for attribute header messages
If header_type is not specified, min_size will be applied to all header messages.
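
The min_size[:header_type] value can be assembled as in the sketch below. The function name ssize_arg and the size 40 are assumptions made for illustration; the commands are printed rather than executed.

```shell
# Hypothetical helper: compose the value for -s from a minimum size
# and an optional header type.
ssize_arg() {
    min_size=$1
    header_type=${2:-}
    case "$header_type" in
        "")
            # No header type: min_size applies to all header messages.
            echo "$min_size" ;;
        dspace|dtype|fill|pline|attr)
            echo "$min_size:$header_type" ;;
        *)
            echo "unknown header_type: $header_type" >&2
            return 1 ;;
    esac
}

echo "h5repack -s $(ssize_arg 40 dtype) in_file out_file"
echo "h5repack -s $(ssize_arg 40) in_file out_file"
```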

-f filter   or  --filter=filter
Filter type

filter is a string of the following format:

list_of_objects : name_of_filter[=filter_parameters]

list_of_objects is a comma-separated list of object names; the filter(s) are applied only to those objects. If no object names are specified, the filter is applied to all objects.

name_of_filter can be one of the following:
     GZIP, to apply the HDF5 GZIP filter (GZIP compression)
     SZIP, to apply the HDF5 SZIP filter (SZIP compression)
     SHUF, to apply the HDF5 shuffle filter
     FLET, to apply the HDF5 Fletcher32 checksum filter
     NBIT, to apply the HDF5 N-bit filter
     SOFF, to apply the HDF5 scale/offset filter
     UD, to apply a user-defined filter
     NONE, to remove any filter(s)

filter_parameters conveys optional compression information:
     GZIP=deflation_level from 1-9
     SZIP=pixels_per_block,coding_method
         pixels_per_block is an even number in the range 2-32.
         coding_method is EC or NN.
     SHUF (no parameter)
     FLET (no parameter)
     NBIT (no parameter)
     SOFF=scale_factor,scale_type
         scale_factor is an integer.
         scale_type is either IN or DS.
     UD=filter_id,nfilter_params,value_1[,value_2,....,value_n]
         filter_id is the filter identifier.
         nfilter_params is the number of filter parameters.
         value_1 through value_n are the values of each filter parameter.
                 Number of values must match the value of nfilter_params.
     NONE (no parameter)
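
To illustrate how a filter string is put together, the sketch below builds an SZIP value for -f and rejects parameters outside the ranges listed above. The function name szip_filter is hypothetical; it only prints the string and does not call h5repack.

```shell
# Hypothetical helper: build "list_of_objects:SZIP=pixels_per_block,coding_method".
# Pass an empty first argument to apply the filter to all objects.
szip_filter() {
    objects=$1
    pixels=$2
    coding=$3
    # pixels_per_block must be an even number in the range 2-32.
    if [ "$pixels" -lt 2 ] || [ "$pixels" -gt 32 ] || [ $(( $pixels % 2 )) -ne 0 ]; then
        echo "pixels_per_block must be an even number in the range 2-32" >&2
        return 1
    fi
    # coding_method must be EC or NN.
    case "$coding" in
        EC|NN) ;;
        *) echo "coding_method must be EC or NN" >&2; return 1 ;;
    esac
    if [ -n "$objects" ]; then
        echo "$objects:SZIP=$pixels,$coding"
    else
        echo "SZIP=$pixels,$coding"
    fi
}

echo "h5repack -f $(szip_filter dset1 8 NN) file1 file2"
```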

-l layout   or  --layout=layout
Layout type

layout is a string of the following format:

list_of_objects : layout_type[=layout_parameters]

list_of_objects is a comma-separated list of object names, meaning that layout information is supplied for those objects. If no object names are specified, the layout is applied to all objects.

layout_type can be one of the following:
     CHUNK, to apply chunking layout
     COMPA, to apply compact layout
     CONTI, to apply contiguous layout

layout_parameters is present only in the CHUNK case and specifies the chunk size of each dimension, separated by a lowercase x with no intervening spaces:
     dim_1xdim_2x...xdim_n
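
The chunk specification is the per-dimension sizes joined by x, as in 20x10. A minimal sketch of composing the -l value from separate dimension sizes follows; chunk_layout is a hypothetical helper name, and the command is printed rather than executed.

```shell
# Hypothetical helper: build "list_of_objects:CHUNK=dim_1xdim_2x...xdim_n".
# First argument is the object list (may be empty); the rest are dimension sizes.
chunk_layout() {
    objects=$1
    shift
    dims=""
    for d in "$@"; do
        if [ -z "$dims" ]; then
            dims=$d
        else
            dims="${dims}x${d}"
        fi
    done
    if [ -n "$objects" ]; then
        echo "$objects:CHUNK=$dims"
    else
        echo "CHUNK=$dims"
    fi
}

echo "h5repack -l $(chunk_layout dset1,dset2 20 10) file1 file2"
```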

-e file   or  --file=file
File containing values to be passed in for the -f (or --filter) and -l (or --layout) options.
This file contains only the filter and layout flags.

-S fs_strategy   or  --fs_strategy=fs_strategy
The type of file space management strategy to use for the output file.

fs_strategy is a string as listed below:
     ALL_PERSIST: Use persistent free-space managers, aggregators and virtual file driver for file space allocation
     ALL: Use non-persistent free-space managers, aggregators and virtual file driver for file space allocation
     AGGR_VFD: Use aggregators and virtual file driver for file space allocation
     VFD: Use virtual file driver for file space allocation

-T fs_threshold   or  --fs_threshold=fs_threshold
The free-space section threshold to use for the output file.

fs_threshold is the minimum size (in bytes) of free-space sections to be tracked by the library's free-space managers.
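
The two file space options are typically given together. The fragment below is a sketch that accepts only the strategy names listed above; the threshold of 512 bytes is an arbitrary illustrative value, and the command is printed rather than executed.

```shell
# Illustrative settings; 512 is not a recommended value.
fs_strategy=ALL_PERSIST
fs_threshold=512

case "$fs_strategy" in
    ALL_PERSIST|ALL|AGGR_VFD|VFD)
        fs_cmd="h5repack -S $fs_strategy -T $fs_threshold in_file out_file"
        echo "$fs_cmd" ;;
    *)
        echo "unknown fs_strategy: $fs_strategy" >&2 ;;
esac
```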

in_file
Input HDF5 file

out_file
Output HDF5 file

Exit Status:
0      Succeeded.
>0     An error occurred.
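
In a script, the exit status can be used to decide whether the output file is safe to use. The wrapper below is a hypothetical sketch that runs whatever command it is given and reports the status; it is not part of h5repack.

```shell
# Hypothetical wrapper: run a command and report its exit status.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "succeeded"
    else
        echo "failed with exit status $status" >&2
    fi
    return "$status"
}
```

For example, run_and_report h5repack -f GZIP=1 file1 file2 would print "succeeded" only when h5repack exits with status 0.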

Examples:
  1. h5repack -f GZIP=1 -v file1 file2
    Applies GZIP compression to all objects in file1 and saves the output in file2. Prints verbose output.
     
  2. h5repack -f dset1:SZIP=8,NN file1 file2
    Applies SZIP compression only to object dset1.
     
  3. h5repack -l dset1,dset2:CHUNK=20x10 file1 file2
    Applies chunked layout to objects dset1 and dset2.
     
  4. h5repack -f UD=307,1,9 file1 file2
    Adds bzip2 filter to all datasets.

History:
Release     Change
1.10.0      Options added in this release for file space management:
                -S, --fs_strategy
                -T, --fs_threshold
1.8.12      Added user-defined filter parameter (UD) to -f filter, --filter=filter option for use in read and write operations.
1.8.9       -M size, --metadata_block_size=size option introduced in this release.
1.8.1       Original syntax restored; both the new and the original syntax are now supported.
1.8.0       h5repack command line syntax changed in this release.
1.6.2       h5repack introduced in this release.