This document assumes a working familiarity with the UTF-8 encoding of Unicode.
Any reader who is unfamiliar with UTF-8 encoding should read the Wikipedia UTF-8 article (https://en.wikipedia.org/wiki/UTF-8) before proceeding; it provides an excellent primer.
For our context, the most important UTF-8 concepts are:
More specific technical details will only become important if they affect the specifics of your application design or implementation.
H5Pset_char_encoding, which sets the character encoding used for object and attribute names.
For example, the following call sequence could be used to create a dataset with its name encoded with the UTF-8 character set:
    lcpl_id = H5Pcreate(H5P_LINK_CREATE);
    error   = H5Pset_char_encoding(lcpl_id, H5T_CSET_UTF8);
    dset_id = H5Dcreate2(group_id, "datos_ñ", dtype_id, dspace_id, lcpl_id, H5P_DEFAULT, H5P_DEFAULT);
If the character encoding of an object name is unknown, the combination of an H5Dget_create_plist call and an H5Pget_char_encoding call will reveal that information.
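An application that labels a name as UTF-8 may want to confirm first that the bytes really are well-formed UTF-8, for example when the name arrives from user input in an unknown encoding. The following is a minimal sketch of such a check in plain C. The helper is_valid_utf8 is hypothetical and is not part of the HDF5 API; it assumes a NUL-terminated string and rejects overlong forms, surrogate code points, and values above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of HDF5): returns true if the
 * NUL-terminated string s is well-formed UTF-8. */
bool is_valid_utf8(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        unsigned int cp;   /* code point decoded so far */
        unsigned int min;  /* smallest code point legal for this length */
        int trailing;      /* continuation bytes expected after the lead byte */

        if (*p < 0x80) {                    /* 0xxxxxxx: plain ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {   /* 110xxxxx: 2-byte sequence */
            cp = *p & 0x1F; min = 0x80;    trailing = 1;
        } else if ((*p & 0xF0) == 0xE0) {   /* 1110xxxx: 3-byte sequence */
            cp = *p & 0x0F; min = 0x800;   trailing = 2;
        } else if ((*p & 0xF8) == 0xF0) {   /* 11110xxx: 4-byte sequence */
            cp = *p & 0x07; min = 0x10000; trailing = 3;
        } else {
            return false;  /* stray continuation byte or invalid lead byte */
        }
        p++;

        for (int i = 0; i < trailing; i++, p++) {
            if ((*p & 0xC0) != 0x80)        /* also catches a premature NUL */
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min)                       /* overlong encoding */
            return false;
        if (cp > 0x10FFFF)                  /* beyond the Unicode range */
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)   /* UTF-16 surrogate */
            return false;
    }
    return true;
}
```

A caller might run is_valid_utf8(name) before the H5Dcreate2 call and fall back to its own error handling when the check fails. Note that a Latin-1 encoded "ñ" (the single byte 0xF1) fails this check, while the two-byte UTF-8 form 0xC3 0xB1 passes.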
H5Tset_cset, which sets the character encoding to be used in building a character datatype.
For example, the following commands could be used to create an 8-byte, UTF-8 encoded string datatype (the size given to H5Tset_size is in bytes) for use in either an attribute or a dataset:
    utf8_8char_dtype_id = H5Tcopy(H5T_C_S1);
    error = H5Tset_cset(utf8_8char_dtype_id, H5T_CSET_UTF8);
    error = H5Tset_size(utf8_8char_dtype_id, 8);
If a character or string datatype’s character encoding is unknown, an H5Tget_cset call can be used to determine it.
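When sizing a UTF-8 string datatype, it helps to keep byte length and character count apart: H5Tset_size counts bytes, while a single UTF-8 character may occupy from one to four bytes. The sketch below, in plain C, counts the characters (code points) in a string so the two figures can be compared. The helper utf8_char_count is hypothetical, is not part of HDF5, and assumes its input is already well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of HDF5): counts the code points in a
 * well-formed, NUL-terminated UTF-8 string. Every code point has exactly
 * one byte that is not a continuation byte (10xxxxxx), so counting the
 * non-continuation bytes gives the number of code points. */
size_t utf8_char_count(const char *s)
{
    assert(s != NULL);
    size_t count = 0;

    for (const unsigned char *p = (const unsigned char *)s; *p != 0; p++) {
        if ((*p & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the name "datos_ñ" used earlier, strlen reports 8 bytes while utf8_char_count reports 7 characters, because "ñ" is stored as the two bytes 0xC3 0xB1. A string of four "ñ" characters likewise occupies 8 bytes.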
For object and attribute names:
    H5Pset_char_encoding
    H5Pget_char_encoding
For dataset and attribute datatypes:
    H5Tset_cset
    H5Tget_cset
UTF-8 article on Wikipedia (https://en.wikipedia.org/wiki/UTF-8)
Describes HDF5 Release 1.8.12, November 2013. Copyright by The HDF Group.