Use Cases for the HDF5 Standalone Tests

Elena Pourmal

March, 2005

 

 

This document describes the motivation for creating a standalone test suite (or test suites), discusses the requirements for the test suite, and gives examples of use cases. The use cases are based on requests received by the HDF Help Desk during the past few years.

 

It will be obvious to the reader that one test suite cannot cover all of the use cases, but it will be equally obvious that a great deal of effort will be duplicated if we decide to address most of the use cases discussed below independently.

 

The goal of this document is to summarize all of the requests and to start a discussion on the best approach to addressing the problem.

 

Design of the test suite and implementation details are subjects for another document.

 

Motivation

 

The HDF5 C, Fortran, C++, and High-Level (HL) libraries come with comprehensive test suites. The test suites are part of the HDF5 source distribution. Most of the tests are configured, built, and run when the C, Fortran, C++, and HL HDF5 libraries are built. After the HDF5 libraries are installed, several examples that are also part of the HDF5 source distribution can be run to verify the correctness of the installation.

 

The HDF5 source contains a few C performance programs, such as the h5perf, perf_meta, and benchpar parallel benchmarks and the chunk and iopipe sequential benchmarks, in the perform subdirectory of the HDF5 top-level directory. These performance tests do not run automatically when the sequential and parallel HDF5 libraries are tested. Also, there is no easy way to store the reported benchmark results and use them to compare performance across different platforms or across different versions of the HDF5 library.

 

Only the HDF5 C, C++, and Fortran examples are available with the installed HDF5 libraries. The performance programs are not installed, and users of the libraries usually do not have access to them. The HDF5 examples are too trivial to verify the correctness of the libraries; they are also not designed to be used as benchmarks.

 

While the provided HDF5 test suites serve well during the HDF5 library development process, they cannot be used for the verification, integration, and acceptance processes that are part of software installation in many computational laboratories and research centers, such as NASA DAACs, government labs, and industrial companies. The HDF Group provides pre-built HDF5 libraries and tools. Nevertheless, in most institutions where HDF5 is used in “production mode,” system administrators and support staff have to rebuild the libraries and rerun the HDF5 test suites that come with the source to ensure the correctness of the libraries, instead of using the pre-built binaries (even when those binaries were created on the same system).

 

Existing HDF5 benchmarks focus only on comparing I/O performance between different layers, such as the system I/O, MPI I/O, and HDF5 I/O layers. They cannot be used by HDF5 library developers and HDF5 users to analyze the performance of sequential and parallel implementations of the same application, or to compare the performance of applications that use the same HDF5 objects but with different creation and/or access properties.

 

Neither the HDF5 test suites nor the HDF5 benchmarks can be easily used to check for performance improvements or degradation between versions of the HDF5 libraries. They also cannot be used to check backward/forward compatibility. Often, the tests and benchmarks are too difficult to understand, and therefore they cannot serve as examples or templates for users’ application programs.

 

To summarize: the current HDF5 test suites and benchmarks do not satisfy the requirements that have emerged over the last few years. Those requirements are summarized in the next section.

 

Requirements for the new test suite

 

We need to provide

  1. Sequential and parallel test suites to perform
    1. HDF5 performance regression testing
    2. HDF5 backward/forward compatibility regression testing
  2. Sequential and parallel benchmarks to compare the “HDF5 library to itself”, i.e., to compare the performance of applications that use the same HDF5 objects but with different properties
  3. Sequential and parallel test suites to verify the installation of the pre-built libraries and tools
  4. Sequential and parallel test suites to verify the correctness of the major HDF5 features implemented in the pre-built libraries and tools
  5. Sequential and parallel test suites that can serve as “real” examples or applications templates for the HDF5 users
  6. What else?

 

Test suites, benchmarks and examples should also satisfy the following “implementation requirements”:

 

 

 

 

Requirements 1–6 from the first list fall into the following three categories:

 

It is not easy to address all of the requirements in one test suite, since three different goals are pursued. Performance testing (requirements 1 and 2) is the most urgent and critical task for successful HDF5 library development and support efforts, so it has to be implemented first.

With careful design and deployment of performance regression tests that satisfy the “implementation requirements,” it will not be too difficult to address requirements 3–5.

 

Use cases

 

This section discusses the use cases for standalone test suites, i.e. it focuses on how the test suite will be used; there is no intention to discuss what exactly will be tested.

 

1. Sequential HDF5 performance regression testing

   a. We would like to check on a regular basis (say, once a week or with each successful snapshot) that HDF5 performance does not degrade during the development process, i.e., that bug fixes, new features added to the library, etc. do not affect common raw data writing/reading operations, for example:

      i. writing/reading contiguous dataset(s) of
         1. atomic types
         2. compound types (by struct or by member of the struct)
         3. variable-length types
         4. other types TBD
      ii. writing/reading contiguous dataset(s) by hyperslabs
         1. using regular hyperslabs
         2. using hyperslabs that result from set operations on hyperslabs
         3. etc. TBD
      iii. writing/reading chunked dataset(s)
         1. etc. TBD
      iv. writing/reading compressed dataset(s)
         1. GZIP compression
         2. SZIP compression
         3. combinations with other filters
         4. etc. TBD
      v. TBD

   b. We would like to check on a regular basis (say, once a week or with each successful snapshot) that HDF5 performance does not degrade during the development process, i.e., that bug fixes, new features added to the library, etc. do not affect common metadata writing/reading operations, for example:

      i. creation/access/deletion of multiple objects in an HDF5 file (say, 10^6 objects)
      ii. creation/access/deletion of attributes (with many variations)
      iii. creation/access of flat files (all objects in one group)
      iv. creation/access of nested files (deep tree structure)
      v. creation/access of nested files (shallow tree structure)

   c. We would like to check on a regular basis (say, once a week or with each successful snapshot) that HDF5 performance does not degrade during the development process, i.e., that bug fixes, new features added to the library, etc. do not affect the operations in a and b when different file drivers are used (especially the family and split drivers)

   d. TBD

2. Parallel HDF5 performance regression testing

   a. We would like to check on a regular basis (say, once a week or with each successful snapshot) that parallel HDF5 performance does not degrade during the development process, i.e., that bug fixes, new features added to the library, etc. do not affect the common operations described in 1 (a and b) implemented (where applicable) using the parallel HDF5 library, for example:

      i. writing/reading a contiguous dataset in parallel
         1. using collective I/O
         2. using independent I/O
      ii. writing/reading a chunked dataset in parallel
         1. using collective I/O
         2. using independent I/O
      iii. writing/reading contiguous and chunked datasets by “simple” hyperslabs (aligned with a chunk or a contiguous set of chunks)
      iv. writing/reading contiguous and chunked datasets by “complex” hyperslabs (a hyperslab that results from set operations, or that is not aligned nicely with contiguous chunks)
      v. creation/access/deletion of multiple objects in the file

 

3. Sequential and parallel HDF5 backward/forward compatibility regression testing

   a. We would like to check on a regular basis (say, once a week or with each successful snapshot) that the sequential HDF5 library is backward/forward compatible (TBD), for example:

      i. create a file with a chunked dataset with multiple attributes using the sequential HDF5 library version 1.6.3 and read it with the sequential HDF5 library version 1.4.5-post9
      ii. create a file with multiple objects using the sequential HDF5 library version 1.6.3 and read it with the sequential HDF5 library version 1.7.26-snap1

   b. We would like to check on a regular basis (say, once a week or with each successful snapshot) that the parallel HDF5 library is backward/forward compatible (TBD), for example:

      i. similar to 3.a (i, ii) but implemented in parallel

   c. We would like to check on a regular basis (say, once a week or with each successful snapshot) that the sequential and parallel HDF5 libraries are backward/forward compatible with each other (TBD), for example:

      i. create a file with a chunked dataset with multiple attributes using the sequential HDF5 library version 1.6.3 and read it with the parallel HDF5 library version 1.4.5-post9

 

4. Sequential benchmarks to compare the “HDF5 library to itself”, i.e., to compare the performance of applications that use the same HDF5 objects but with different properties

   a. We would like to answer the user question “How much performance and file space overhead will I get if I use chunked storage vs. contiguous storage?”

   b. We would like to answer the user question “Is it better to create and access one dataset of a compound datatype with N fields, or N datasets of atomic datatypes?”

   c. We would like to answer the user question “Is it better to use variable-length datatypes, or fixed-length datatypes of an appropriate length plus compression?”

 

Note: It is clear that the benchmarks needed in 4 can also be used successfully in 1.

 

5. Parallel benchmarks to compare the “HDF5 library to itself”, i.e., to compare the performance of applications that use the same HDF5 objects but with different properties

   a. We would like to answer the user question “How much performance and file space overhead will I get if I use chunked storage vs. contiguous storage in my parallel application?”

   b. We would like to answer the user question “Is it better to create and access one dataset of a compound datatype with N fields, or N datasets of atomic datatypes, in my parallel application?”

   c. The HDF Help Desk receives email from a user who complains about poor performance on his system when his application uses extendible datasets. We point the user to the benchmark in our standalone test suite that does exactly what he is complaining about (or we slightly modify it according to the request). The user then runs our benchmark and compares its timing results with the timing results from his application to confirm his theory.

 

Note: It is clear that the benchmarks needed in 5 can also be used successfully in 2.

 

6. Benchmarks to compare the parallel HDF5 library with the sequential HDF5 library

   a. We would like to make sure that parallel HDF5 is scalable on, for example, an SGI Altix system:

      i. writing/reading a contiguous dataset in parallel is faster (or not) than writing/reading a contiguous dataset in sequential mode
      ii. writing/reading multiple datasets in parallel is faster (or not) than writing/reading multiple datasets in sequential mode

   b. We would like to make sure that files created by the parallel library can be accessed/modified with the sequential library, and vice versa.

 

7. Sequential and parallel test suites to verify the installation of the pre-built libraries and tools

   a. We are asked by NERSC staff to help them install HDF5-1.6.4 on their systems. We take our binaries, ask the staff to install them, and then we (or they) run a standalone test suite to verify that the NERSC system has correct (or compatible) versions of the compression libraries and MPI I/O libraries, that permissions are set up correctly, etc.

   b. A user tries to build HDF5-1.6.4 on Solaris 2.9; we have only Solaris 2.8 binaries. We give him the Solaris 2.8 binaries for installation along with the standalone test suite, which the user runs to verify that he got working binaries.

 

8. Sequential and parallel test suites to verify the correctness of the major HDF5 features implemented in the pre-built libraries and tools

   a. The HDF Help Desk receives a question: “My program uses chunking, compression, and the memory (core) file driver; it seems that the core driver is broken in the new release.” We point the user to the core driver test program in our standalone test suite, which he can run and report the result. The user can also easily modify the program to make it as close to his application as possible and send it to the HDF Help Desk to report the problem.

   b. LLNL installs new binaries for the HDF5 libraries. They would like to know whether the bug they saw with an HDF5 collective write call is gone. They run several modified programs from the standalone test suite to verify the fix.

   c. A user is evaluating the HDF5 library and wants to be sure that it does what he needs. We point him to several programs from the standalone test suite that he can use as templates for his application.

9. Sequential and parallel test suites that can serve as “real” examples or application templates for the HDF5 users

   a. The HDF Help Desk receives a question: “I am new to HDF5 and have been studying your Tutorial. The Tutorial example with compression works on my system. I used your example as a template for my application, but I also need hyperslabs and compound datatypes. What do I need to use?”

   b. A user doesn’t use HDF5 at all, but someone gave him an HDF5 file with data he needs to read in his Fortran application. He asks us to help him get the data from the file. We can point him to a simple HDF5 reading program and tell him how to integrate it with his application.