HDF5 File Open/Create/Close Revision

Draft revision: October 26, 2001

Revision Status

Final draft candidate. Solutions for issues 1, 2, and 3a are proposed. All of them involve only revisions to the Reference Manual entries.

The solution for issue 3b is quite extensive and is treated in a separate proposal.

Description

The current (v1.4.2) definitions of H5Fopen, H5Fcreate, and H5Fclose (see Appendix A) need some clarification in the documentation (that is, the Reference Manual) and some revision of library functionality. The issues are presented and solutions proposed below. The proposed revisions are underlined and in red text.

What are the issues?

Issue 1

H5Fopen states that a file opened multiple times returns a unique file identifier for each H5Fopen call, but it does not say whether the library actually opens the file once or multiple times. It also does not mention that it is not always possible to detect that a file has been opened multiple times. The library can detect multiple opens of the same file in some cases, such as Unix local file systems (using major and minor device numbers and inodes) and Windows local drives. It cannot detect them on a network file system (e.g., NFS) or with MPI-IO access.
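The device-and-inode comparison mentioned above can be sketched with plain POSIX calls. This is a minimal illustration of the idea, not the library's actual code; the function names same_physical_file and report_pair are hypothetical. On network or parallel file systems a check like this may not give a reliable answer, which is the limitation described in this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of HDF5.
 * Returns true when both paths resolve to the same device and inode.
 * Returns false if they differ or if either stat() call fails. */
bool same_physical_file(const char *path_a, const char *path_b)
{
    struct stat a, b;

    assert(path_a != NULL && path_b != NULL);
    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return false;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Prints a one-line verdict for a pair of paths; handy for manual checks. */
void report_pair(const char *path_a, const char *path_b)
{
    printf("%s vs %s: %s\n", path_a, path_b,
           same_physical_file(path_a, path_b) ? "same file"
                                              : "not detected as the same file");
}
```

Two different names for one file, such as "data.h5" and "./data.h5", compare equal under this check, while two distinct files do not.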

Solution

Revise the H5Fopen entry to explain the multiple-open feature in more detail.

 

Name: H5Fopen

Signature:

hid_t H5Fopen(const char *name, unsigned flags, hid_t access_id )

Purpose:

Opens an existing file.

Description:

H5Fopen opens an existing file and is the primary function for accessing existing HDF5 files.

The parameter access_id is a file access property list identifier or H5P_DEFAULT for the default I/O access parameters.

The flags argument determines whether writing to an existing file will be allowed or not. The file is opened with read and write permission if flags is set to H5F_ACC_RDWR. All flags may be combined with the bit-wise OR operator (`|') to change the behavior of the file open call. The more complex behaviors of a file's access are controlled through the file-access property list.

Multiple opens of the same file: a file that is opened more than once without first being closed returns a unique file identifier for each open call and can be accessed through any of them. All the open calls should use the same flags argument. In some cases, such as files on a local Unix file system, HDF5 can detect that the same file has been opened multiple times and will maintain coherent access among all the file identifiers. But in many other cases, such as parallel file systems or networked file systems, it is not always possible to detect multiple opens of the same physical file. In those cases, HDF5 will treat the file identifiers as referring to different files and cannot maintain coherent access. Errors are likely to result in these cases. Applications should avoid opening the same file multiple times.

The return value is a file identifier for the open file and it should be closed by calling H5Fclose() when it is no longer needed.

Parameters:

const char *name

Name of the file to access.

unsigned flags

File access flags. Allowable values are:

H5F_ACC_RDWR

Allow read and write access to file.

H5F_ACC_RDONLY

Allow read-only access to file.

·  H5F_ACC_RDWR and H5F_ACC_RDONLY are mutually exclusive; use exactly one.

·  An additional flag, H5F_ACC_DEBUG, prints debug information. This flag is used only by HDF5 library developers; it is neither tested nor supported for use in applications.

hid_t access_id

Identifier for the file access property list. If parallel file access is desired, this is a collective call according to the communicator stored in the access_id. Use H5P_DEFAULT for default file access properties.

Returns:

Returns a file identifier if successful; otherwise returns a negative value.

Non-C API(s):


Issue 2

H5Fcreate does not mention the multiple-open feature. It does not explain what happens if a file is already open, through either H5Fopen or H5Fcreate, and H5Fcreate is then called with H5F_ACC_TRUNC. I believe the current library returns failure if it can detect that this is a multiple open of the same file (with the same limitations as in Issue 1). If it cannot detect the multiple-open condition, it truncates the existing file, which is likely to cause errors. On the other hand, issuing conflicting file open requests is an application error.
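Why an undetected truncation is harmful can be seen at the operating-system level, without HDF5 at all. The sketch below is only an analogy using POSIX calls, and the function name is hypothetical: one descriptor writes data, a second open of the same path uses O_TRUNC, and the first descriptor then finds the file empty. An HDF5 file identifier whose file is truncated by an undetected H5Fcreate would be in a comparable position.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of HDF5.
 * Writes len bytes through one descriptor, then reopens the same path with
 * O_TRUNC through a second descriptor. Returns the file size that the first
 * descriptor observes afterwards, or -1 on any error. */
long size_seen_after_truncating_reopen(const char *path, const char *data,
                                       size_t len)
{
    struct stat st;
    long result = -1;
    int second;
    int first;

    assert(path != NULL && data != NULL);
    first = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (first < 0)
        return -1;

    if (write(first, data, len) == (ssize_t)len) {
        /* Second, conflicting open of the same file: discards its contents. */
        second = open(path, O_RDWR | O_TRUNC);
        if (second >= 0) {
            if (fstat(first, &st) == 0)
                result = (long)st.st_size;
            close(second);
        }
    }
    close(first);
    return result;
}
```

The data written through the first descriptor is gone even though that descriptor was never closed and received no error.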

Solution

Revise the H5Fcreate entry to explain the multiple-open feature in more detail.

 

 

Name: H5Fcreate

Signature:

hid_t H5Fcreate(const char *name, unsigned flags, hid_t create_id, hid_t access_id )

Purpose:

Creates HDF5 files.

Description:

H5Fcreate is the primary function for creating HDF5 files.

The flags parameter determines whether an existing file will be overwritten. All newly created files are opened for both reading and writing. All flags may be combined with the bit-wise OR operator (`|') to change the behavior of the H5Fcreate call.

The more complex behaviors of file creation and access are controlled through the file-creation and file-access property lists. The value of H5P_DEFAULT for a property list value indicates that the library should use the default values for the appropriate property list.

If the file being created is already open through a previous H5Fopen or H5Fcreate call, HDF5 may or may not detect that both refer to the same physical file (see H5Fopen for the limits of same-file detection). If HDF5 detects that the same file is already open, H5Fcreate returns failure, whether or not H5F_ACC_TRUNC is used. If HDF5 does not detect that the same file is already open and H5F_ACC_TRUNC is not used, H5Fcreate returns failure because the file already exists. But if H5F_ACC_TRUNC is used, H5Fcreate truncates the existing file and returns a valid file identifier. Such truncation of a currently open file will most likely result in errors. Applications should avoid applying H5Fcreate to a file that is already open.

Parameters:

const char *name

Name of the file to access.

unsigned flags

File access flags. Allowable values are:

H5F_ACC_TRUNC

Truncate file, if it already exists, erasing all data previously stored in the file.

H5F_ACC_EXCL

Fail if file already exists.

·  H5F_ACC_TRUNC and H5F_ACC_EXCL are mutually exclusive; use exactly one.

·  An additional flag, H5F_ACC_DEBUG, prints debug information. This flag is used only by HDF5 library developers; it is neither tested nor supported for use in applications.

hid_t create_id

File creation property list identifier, used when modifying default file meta-data. Use H5P_DEFAULT for default file creation properties.

hid_t access_id

File access property list identifier. If parallel file access is desired, this is a collective call according to the communicator stored in the access_id. Use H5P_DEFAULT for default file access properties.

Returns:

Returns a file identifier if successful; otherwise returns a negative value.

Non-C API(s):


 

Issue 3

H5Fclose “promises” that an open file that still has open objects will stay open even after H5Fclose has been called, and that all open objects can still be accessed, including for writing. Then, when all objects in the file are closed, the file is fully closed. This is not always possible. For example, an MPI-IO file close is a collective call, so all the processes that opened the file must close it collectively. The file cannot be closed at some later time by each process independently. Another example is an application using AFS-token-based file access privileges, which may destroy its AFS token after H5Fclose has returned success, making any future access to the file illegal.

Solution

a) Revise the H5Fclose entry to state that access to objects after H5Fclose may not always work. Note that the second paragraph of the description is redundant and should have been removed.

b) Also revise the library code so that the file is really closed in the example cases mentioned above, and so that H5Fclose returns failure if not all objects in the file are closed. This revision of the delay close feature is quite extensive and is treated in a separate proposal.
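The difference between the delay close behavior described in Issue 3 and the stricter behavior outlined in b) can be sketched with a toy reference-counting model. Every type and function below is hypothetical and exists only for illustration; none of it is HDF5 library code, and the real implementation may be organized quite differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: not part of the HDF5 library. */
typedef struct {
    int  open_objects;  /* datasets, groups, etc. that are still open */
    bool id_open;       /* the file identifier is still valid */
    bool fully_closed;  /* the underlying file has really been closed */
} toy_file;

void toy_file_open(toy_file *f)
{
    f->open_objects = 0;
    f->id_open = true;
    f->fully_closed = false;
}

void toy_object_open(toy_file *f)
{
    assert(!f->fully_closed);
    f->open_objects++;
}

/* Delay close: the identifier is released at once, but the file itself is
 * closed only when no objects remain open. */
void toy_file_close(toy_file *f)
{
    f->id_open = false;
    if (f->open_objects == 0)
        f->fully_closed = true;
}

void toy_object_close(toy_file *f)
{
    assert(f->open_objects > 0);
    f->open_objects--;
    if (!f->id_open && f->open_objects == 0)
        f->fully_closed = true;  /* the deferred close happens here */
}

/* Stricter variant along the lines of b): refuse to close while objects
 * remain open. Returns 0 on success and -1 on failure. */
int toy_file_close_strict(toy_file *f)
{
    if (f->open_objects > 0)
        return -1;
    f->id_open = false;
    f->fully_closed = true;
    return 0;
}
```

In the delay close model, the real close happens inside toy_object_close, at a time chosen by whichever caller closes the last object. That is the step that cannot work when the close must be collective, as with MPI-IO. The strict variant keeps the real close inside the file close call itself.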

 

Name: H5Fclose

Signature:

herr_t H5Fclose(hid_t file_id )

Purpose:

Terminates access to an HDF5 file.

Description:

H5Fclose terminates access to an HDF5 file by flushing all data to storage and terminating access to the file through file_id.

If this is the last file identifier open for the file and no other access identifier is open (e.g., a dataset identifier, group identifier, or shared datatype identifier), the file will be fully closed and access will end.

Delay close: Note the following deviation from the above-described behavior. If H5Fclose is called for a file but one or more objects within the file remain open, those objects will remain accessible until they are individually closed. Thus, if the dataset data_sample is open when H5Fclose is called for the file containing it, data_sample will remain open and accessible (including writable) until it is explicitly closed. The file will be automatically closed once all objects in the file have been closed. Be warned that delay close is not always possible. For example, an MPI-IO file close is a collective call, so all the processes that opened the file must close it collectively. The file cannot be closed at some later time by each process independently. Another example is an application using AFS-token-based file access privileges, which may destroy its AFS token after H5Fclose has returned success, making any future access to the file illegal. Applications should close all open objects of a file before calling H5Fclose.

Parameters:

hid_t file_id

Identifier of a file to terminate access to.

Returns:

Returns a non-negative value if successful; otherwise returns a negative value.

Non-C API(s):


Appendix A: Current v1.4.2 definitions of H5Fopen/H5Fcreate/H5Fclose

 

Name: H5Fopen

Signature:

hid_t H5Fopen(const char *name, unsigned flags, hid_t access_id )

Purpose:

Opens an existing file.

Description:

H5Fopen opens an existing file and is the primary function for accessing existing HDF5 files.

The parameter access_id is a file access property list identifier or H5P_DEFAULT for the default I/O access parameters.

The flags argument determines whether writing to an existing file will be allowed or not. The file is opened with read and write permission if flags is set to H5F_ACC_RDWR. All flags may be combined with the bit-wise OR operator (`|') to change the behavior of the file open call. The more complex behaviors of a file's access are controlled through the file-access property list.

Files which are opened more than once return a unique identifier for each H5Fopen() call and can be accessed through all file identifiers.

The return value is a file identifier for the open file and it should be closed by calling H5Fclose() when it is no longer needed.

Parameters:

const char *name

Name of the file to access.

unsigned flags

File access flags. Allowable values are:

H5F_ACC_RDWR

Allow read and write access to file.

H5F_ACC_RDONLY

Allow read-only access to file.

·  H5F_ACC_RDWR and H5F_ACC_RDONLY are mutually exclusive; use exactly one.

·  An additional flag, H5F_ACC_DEBUG, prints debug information. This flag is used only by HDF5 library developers; it is neither tested nor supported for use in applications.

hid_t access_id

Identifier for the file access properties list. If parallel file access is desired, this is a collective call according to the communicator stored in the access_id. Use H5P_DEFAULT for default file access properties.

Returns:

Returns a file identifier if successful; otherwise returns a negative value.

Non-C API(s):


Name: H5Fcreate

Signature:

hid_t H5Fcreate(const char *name, unsigned flags, hid_t create_id, hid_t access_id )

Purpose:

Creates HDF5 files.

Description:

H5Fcreate is the primary function for creating HDF5 files .

The flags parameter determines whether an existing file will be overwritten. All newly created files are opened for both reading and writing. All flags may be combined with the bit-wise OR operator (`|') to change the behavior of the H5Fcreate call.

The more complex behaviors of file creation and access are controlled through the file-creation and file-access property lists. The value of H5P_DEFAULT for a property list value indicates that the library should use the default values for the appropriate property list.

Parameters:

const char *name

Name of the file to access.

uintn flags

File access flags. Allowable values are:

H5F_ACC_TRUNC

Truncate file, if it already exists, erasing all data previously stored in the file.

H5F_ACC_EXCL

Fail if file already exists.

·  H5F_ACC_TRUNC and H5F_ACC_EXCL are mutually exclusive; use exactly one.

·  An additional flag, H5F_ACC_DEBUG, prints debug information. This flag is used only by HDF5 library developers; it is neither tested nor supported for use in applications.

hid_t create_id

File creation property list identifier, used when modifying default file meta-data. Use H5P_DEFAULT for default file creation properties.

hid_t access_id

File access property list identifier. If parallel file access is desired, this is a collective call according to the communicator stored in the access_id. Use H5P_DEFAULT for default file access properties.

Returns:

Returns a file identifier if successful; otherwise returns a negative value.

Non-C API(s):


Name: H5Fclose

Signature:

herr_t H5Fclose(hid_t file_id )

Purpose:

Terminates access to an HDF5 file.

Description:

H5Fclose terminates access to an HDF5 file by flushing all data to storage and terminating access to the file through file_id.

If this is the last file identifier open for the file and no other access identifier is open (e.g., a dataset identifier, group identifier, or shared datatype identifier), the file will be fully closed and access will end.

If this is the last file identifier open for the file and other access identifiers are still in use, those access identifiers remain valid until separately closed and can still be used. (But the file identifier is no longer valid and cannot be used.) Once all of the remaining access identifiers are closed, the file will be fully closed and access will end.

EXCEPTION: Note the following deviation from the above-described behavior. If H5Fclose is called for a file but one or more objects within the file remain open, those objects will remain accessible until they are individually closed. Thus, if the dataset data_sample is open when H5Fclose is called for the file containing it, data_sample will remain open and accessible (including writable) until it is explicitely closed. The file will be automatically closed once all objects in the file have been closed.

Parameters:

hid_t file_id

Identifier of a file to terminate access to.

Returns:

Returns a non-negative value if successful; otherwise returns a negative value.

Non-C API(s):


 

Last revision: Albert Cheng, October 26, 2001

Email: hdf5lib@ncsa.uiuc.edu