HDF5 documents and links
Introduction to HDF5
HDF5 User Guide

H5IN: Indexing Interface

Index API Functions

The indexing API allows the user to create an index over a specific dataset and then query efficiently over that dataset.

The C Interfaces:

H5INcreate
H5INquery

Alphabetical Listing

H5INcreate
H5INquery

Name: H5INcreate

Signature:

hid_t H5INcreate( const char *grp_name, hid_t grp_loc_id, const hid_t property_list, hid_t data_loc_id,
const char *data_loc_name, const char *field_name, hsize_t max_mem_size, hbin_type_t bin_type, hsize_t num_bins, void *bins )

Purpose:

Creates an index.

Description:

H5INcreate creates an index on the dataset data_loc_name at the location data_loc_id and stores it in the group grp_name at location grp_loc_id. The index is named data_loc_name when the dataset has an atomic datatype and field_name when it has a compound datatype. If the dataset named by data_loc_name has an atomic datatype, field_name should be NULL; if it has a compound datatype, field_name selects the field of the datatype on which the index is created.

The parameters bin_type, num_bins, bins are used to define the binning strategy to be used in the indexing.

hbin_type_t	`H5IN_NO_BINS`	No Bins to be created
hbin_type_t	`H5IN_EW_BINS`	Equi-width bins to be used.
hbin_type_t	`H5IN_USER_BINS`	User provided bins to be used.
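As a rough illustration of what equi-width binning means, the sketch below maps a value to one of num_bins equal-width bins over a known range. The function name ew_bin_index, the explicit lo and hi range, and the clamping of out-of-range values are assumptions made for this example; the text does not specify how H5INcreate derives bin boundaries.

```c
#include <assert.h>
#include <stddef.h>

/* Map a value to one of num_bins equal-width bins spanning [lo, hi].
 * Values at or below lo go to bin 0; values at or above hi go to the last bin.
 * Illustrative only; this helper is not part of the H5IN interface. */
size_t ew_bin_index(double value, double lo, double hi, size_t num_bins)
{
    assert(num_bins > 0 && hi > lo);
    if (value <= lo)
        return 0;
    if (value >= hi)
        return num_bins - 1;

    double width = (hi - lo) / (double)num_bins;
    size_t idx = (size_t)((value - lo) / width);

    /* Guard against rounding that would push a value just below hi
     * into a bin one past the end. */
    return idx < num_bins ? idx : num_bins - 1;
}
```

With lo = 0, hi = 100 and num_bins = 10, each bin is 10 units wide, so the value 25 falls in bin 2.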

When user-provided bins are used, the bins parameter defines them. No type checking is done on the bins; they are assumed to be of the same type as the data being indexed.
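Because no type checks are done, the caller is responsible for declaring the bin values with the same C type as the dataset elements. The sketch below shows one plausible way to look up a bin in a user-supplied array, here for int data. The layout (an ascending array of bin edges) and the helper name user_bin_index are assumptions for illustration; the text does not describe how the bins array is laid out or searched.

```c
#include <assert.h>
#include <stddef.h>

/* Return the bin that value falls into, given ascending bin edges.
 * Bin 0 holds values below edges[0]; bin i holds edges[i-1] <= value < edges[i];
 * bin num_edges holds values at or above the last edge.
 * The edges array has the same element type (int) as the data it bins. */
size_t user_bin_index(const int *edges, size_t num_edges, int value)
{
    assert(edges != NULL || num_edges == 0);

    size_t lo = 0, hi = num_edges;   /* search window is [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (edges[mid] <= value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;   /* number of edges that are <= value */
}
```

If the edges were declared as double while the dataset holds int values, a void pointer to them would be reinterpreted silently, which is why the matching declaration matters.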

The parameter property_list defines the properties of the dataset where the index is to be stored.

The parameter max_mem_size sets the maximum amount of memory that can be allocated for sorting. Most datasets fit in memory for sorting, but it would be incorrect to assume that all do. This parameter lets the user limit the amount of memory allocated during index creation.
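To get a feel for choosing max_mem_size, the following sketch estimates how many memory-sized chunks a sort would have to work through for a given budget. It assumes the sort processes the data in chunks no larger than max_mem_size, which the text suggests but does not state. The helper names are hypothetical, and uint64_t stands in for hsize_t.

```c
#include <assert.h>
#include <stdint.h>

/* Number of elements of elem_size bytes that fit in max_mem_size bytes. */
uint64_t elems_per_chunk(uint64_t max_mem_size, uint64_t elem_size)
{
    assert(elem_size > 0);
    return max_mem_size / elem_size;
}

/* Number of chunks needed to cover num_elems elements.
 * Returns 0 when there is nothing to sort or when the budget is too
 * small to hold even one element. */
uint64_t sort_chunks_needed(uint64_t num_elems, uint64_t elem_size,
                            uint64_t max_mem_size)
{
    uint64_t per_chunk = elems_per_chunk(max_mem_size, elem_size);
    if (per_chunk == 0)
        return 0;
    /* Ceiling division written to avoid overflow in the addition. */
    return num_elems / per_chunk + (num_elems % per_chunk != 0 ? 1 : 0);
}
```

For one million 8-byte elements and a 1 MiB budget, 131072 elements fit per chunk, so eight chunks are needed; a budget large enough for the whole dataset gives a single chunk.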

Parameters:

const char *grp_name IN: The name of the group where the index is to be stored.

hid_t grp_loc_id IN: Location identifier used to locate the group where the index is stored.

hid_t property_list IN: The properties of the dataset where the index is stored.

hid_t data_loc_id IN: Location identifier used to locate the dataset containing the data to be indexed.

const char *data_loc_name IN: Name of the dataset to be indexed.

const char *field_name IN: Name of the field to be indexed.

hsize_t max_mem_size IN: Maximum memory size that can be used during sorting of the index.

hbin_type_t bin_type IN: The type of binning to be used.

hsize_t num_bins IN: The number of bins to be used (if not required should be 0).

void *bins IN: The actual bins to be used (if not required should be NULL).
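The three binning parameters have to agree with one another, so a caller may want to check them before calling H5INcreate. The sketch below encodes the rules stated in the parameter list. The enum is a local stand-in: the enumerator names come from this page, but their numeric values are assumed. The helper bin_args_consistent is hypothetical, as is the rule that equi-width binning takes a NULL bins pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declaration. The names are taken from this page;
 * the numeric values are assumed for the example. */
typedef enum { H5IN_NO_BINS, H5IN_EW_BINS, H5IN_USER_BINS } hbin_type_t;

/* Return 1 if bin_type, num_bins and bins are mutually consistent,
 * 0 otherwise. */
int bin_args_consistent(hbin_type_t bin_type, unsigned long long num_bins,
                        const void *bins)
{
    switch (bin_type) {
    case H5IN_NO_BINS:
        /* Nothing is required: num_bins is 0 and bins is NULL. */
        return num_bins == 0 && bins == NULL;
    case H5IN_EW_BINS:
        /* Assumed: a bin count is given and no explicit bins are passed. */
        return num_bins > 0 && bins == NULL;
    case H5IN_USER_BINS:
        /* The caller supplies both the count and the bins themselves. */
        return num_bins > 0 && bins != NULL;
    }
    return 0;   /* unknown bin type */
}
```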

Returns:

Returns the identifier of the group where the index is stored if successful; otherwise returns a negative value.

Name: H5INquery

Signature:

hid_t H5INquery( hid_t dataset, const char **keys, void *ubounds, void *lbounds, int nkeys )

Purpose:

Queries a dataset.

Description:

H5INquery queries the given dataset over the keys provided. The current implementation supports only AND queries across the keys, with an individual range query for each key. The parameter nkeys specifies the number of keys over which the query is done. If the dataset being queried does not have an index, the current implementation returns an error. If the dataset has an atomic datatype, the query can be on only one key, which must be the dataset name; if it has a compound datatype, the query can be on multiple keys, each of which must be the name of a field on which an index has been built. Again, no type checking is done; the types of the bounds are assumed to correspond to the type of the key.

Parameters:

hid_t dataset IN: Dataset being queried.

const char **keys IN: The name of the keys.

void *ubounds IN: The upper bounds for the keys.

void *lbounds IN: The lower bounds for the keys.

int nkeys IN: The number of keys.

Returns:

Returns a valid identifier if successful; otherwise returns a negative value.
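The AND-of-ranges semantics described for H5INquery can be sketched as a predicate over one record. This is not the library's implementation. The helper name matches_all_ranges is hypothetical, all keys are taken to be double for simplicity, and the bounds are treated as inclusive, which the text does not specify.

```c
#include <assert.h>

/* Return 1 if every key value lies within its own [lbound, ubound] range,
 * 0 otherwise. Element i of each array belongs to key i. */
int matches_all_ranges(const double *values, const double *lbounds,
                       const double *ubounds, int nkeys)
{
    assert(nkeys > 0);
    for (int i = 0; i < nkeys; i++) {
        if (values[i] < lbounds[i] || values[i] > ubounds[i])
            return 0;   /* one failed range fails the whole AND query */
    }
    return 1;
}
```

A record matches only when all nkeys range conditions hold; with two keys, a value inside the first range but outside the second is rejected.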

HDF Help Desk
Describes HDF5 Release 1.6.5, November 2005
