DATABASE

Type of Work

Research


Item

Integrate HDF5 with SQL Server's CLR


Explanation

The Common Language Runtime (CLR) is the .NET runtime, which SQL Server can host. Code written in any .NET language, or native code wrapped for .NET, can be packaged as assemblies (libraries) and executed there. In this context, one can define functions, tables, and aggregates that make sense to scientists, in essence keeping all the data and programs in one integrated place. Applied to HDF5, HDF5 objects would become part of the package, with HDF5 library functions callable natively from the CLR. A full integration would mean that the HDF5 library itself is implemented in C#. Because a C# implementation would be a very large undertaking, it is recommended that a prototype first be created that wraps the existing HDF5 library with C#.
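The wrapping approach can be illustrated outside the CLR. The sketch below is an analogy only, not the proposed C# prototype: it uses Python's ctypes to call one function of the native HDF5 C library, H5get_libversion, much as a C# wrapper would call it through platform invoke. The helper name hdf5_library_version is hypothetical, and the sketch assumes the shared library can be located under the name "hdf5"; if it cannot, the function returns None.

```python
import ctypes
import ctypes.util


def hdf5_library_version():
    """Return (major, minor, release) of the native HDF5 library,
    or None if the shared library cannot be found or loaded."""
    # Assumption: the shared library is discoverable as "hdf5".
    path = ctypes.util.find_library("hdf5")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None

    # C prototype:
    # herr_t H5get_libversion(unsigned *majnum, unsigned *minnum, unsigned *relnum);
    lib.H5get_libversion.argtypes = [ctypes.POINTER(ctypes.c_uint)] * 3
    lib.H5get_libversion.restype = ctypes.c_int

    major = ctypes.c_uint()
    minor = ctypes.c_uint()
    release = ctypes.c_uint()
    status = lib.H5get_libversion(
        ctypes.byref(major), ctypes.byref(minor), ctypes.byref(release)
    )
    if status < 0:  # HDF5 reports failure with a negative herr_t
        raise RuntimeError("H5get_libversion failed")
    return major.value, minor.value, release.value


if __name__ == "__main__":
    print(hdf5_library_version())
```

The wrapper declares the native signature once and exposes an ordinary function to managed callers. A C# prototype would repeat this pattern for each HDF5 call it needs.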

HDF5 data itself would reside primarily in HDF5 files, keeping it available for other uses just as it is now. However, where performance and other considerations dictate, some HDF5 data would likely also be brought into SQL Server tables, and indexes would be created within SQL Server for fast query of, and access to, the HDF5 data that remains in files. Applied Physics Lab (APL) has used this approach for astronomy data management and analysis, rewriting a large code base in C#.
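A minimal sketch of the indexing idea follows, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table name, column names, file names, and sample coordinates are all hypothetical and chosen only for illustration. The bulk data stays in HDF5 files, while the database holds a small indexed catalog that maps a searchable attribute to the file and dataset path where the data lives.

```python
import sqlite3

# Hypothetical catalog rows: (HDF5 file, dataset path inside the file,
# right ascension in degrees, declination in degrees).
SAMPLE_ROWS = [
    ("survey_a.h5", "/images/field_001", 10.5, -5.0),
    ("survey_a.h5", "/images/field_002", 45.2, 12.3),
    ("survey_b.h5", "/images/field_107", 47.9, 30.1),
]


def build_catalog(rows):
    """Create an in-memory catalog table with an index on the search column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE hdf5_dataset (
            file_path    TEXT NOT NULL,
            dataset_path TEXT NOT NULL,
            ra_deg       REAL NOT NULL,
            dec_deg      REAL NOT NULL
        )
        """
    )
    # The index lets range queries avoid scanning every catalog row.
    conn.execute("CREATE INDEX ix_hdf5_dataset_ra ON hdf5_dataset (ra_deg)")
    conn.executemany("INSERT INTO hdf5_dataset VALUES (?, ?, ?, ?)", rows)
    return conn


def find_datasets(conn, ra_min, ra_max):
    """Return (file_path, dataset_path) pairs whose ra_deg lies in the range."""
    cursor = conn.execute(
        "SELECT file_path, dataset_path FROM hdf5_dataset "
        "WHERE ra_deg BETWEEN ? AND ? ORDER BY ra_deg",
        (ra_min, ra_max),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    catalog = build_catalog(SAMPLE_ROWS)
    for file_path, dataset_path in find_datasets(catalog, 40.0, 50.0):
        print(file_path, dataset_path)
```

The query returns only locations. A caller would then open the named HDF5 file and read the dataset at the returned path, so the database never has to hold the bulk data.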


Documentation

Microsoft trip report, Sept. 2006; also "HDF5 Integration in SQL Server 2005(+)," Gerd Heber, Cornell Theory Center, October, 2006 (https://support.hdfgroup.org/RFC/SQL-HDF5/)