Possible bug in H5D.write, truncation of VLEN strings
Hello,
I have discovered a potential bug, or at least some flaky behavior, when using the low-level HDF5 write function. When I try to write a long string as a variable-length string, it seems to get truncated at 512 bytes (511 characters plus the terminating null). I can write the same string just fine as a fixed-length string.
The minimal script below reproduces the error. I see this on R2012a on both Linux and Mac. Am I missing a parameter or function call that controls the VLEN buffer size, or is something improperly hard-coded in the underlying MEX function?
Cheers, Souheil
-------------
% Create a long string
str = repmat('Hello from matlab. ',[1 1000]);
fprintf('Size of string = %d\n',length(str));
% Create an HDF5 file
filename = 'vlen_string_bug.h5';
fid = H5F.create(filename,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
% Write to a dataset as a variable length string
VLstr_type = H5T.copy('H5T_C_S1');
H5T.set_size(VLstr_type,'H5T_VARIABLE');
space = H5S.create_simple(1, 1, []);
dset = H5D.create(fid, 'VLstr', VLstr_type, space, 'H5P_DEFAULT');
fprintf('Size of VLEN_BUF before = %d\n',H5D.vlen_get_buf_size(dset, VLstr_type, space));
H5D.write(dset, VLstr_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', {str});
fprintf('Size of VLEN_BUF after = %d\n',H5D.vlen_get_buf_size(dset, VLstr_type, space));
H5T.close(VLstr_type);
H5S.close(space);
H5D.close(dset);
% Write to a dataset as a fixed length string
Fstr_type = H5T.copy('H5T_C_S1');
H5T.set_size(Fstr_type, length(str));
space = H5S.create_simple(1, 1, []);
dset = H5D.create(fid, 'Fstr', Fstr_type, space, 'H5P_DEFAULT');
H5D.write(dset, Fstr_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', str);
H5T.close(Fstr_type);
H5S.close(space);
H5D.close(dset);
% Close the file
H5F.close(fid);
% Read the strings back in using the high level read function
t = h5read(filename,'/VLstr');
vlstr = t{1};
fprintf('Size of VLEN string on disk = %d\n',length(vlstr));
t = h5read(filename,'/Fstr');
fstr = t{1};
fprintf('Size of fixed string on disk = %d\n',length(fstr));
Accepted Answer
More Answers (1)
per isakson
on 15 Sep 2012
Edited: per isakson
on 15 Sep 2012
However, the HDF5 User's Guide, page 228, says:
[...] a length and data buffer must be allocated.
I don't see how.
This is not much of an answer. However, could it be that 512 is a default value that needs to be replaced by an appropriate one?
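One way to check whether the cut-off really sits at a fixed 512 bytes would be to write VLEN strings of several lengths and compare what comes back. The sketch below is only a probe, not a fix. The file name vlen_probe.h5, the dataset names, and the list of test lengths are arbitrary choices made for illustration; the HDF5 calls are the same ones used in the script in the question.

```matlab
% Probe (illustrative only): write VLEN strings of several lengths and
% report how many characters h5read returns for each one.
filename = 'vlen_probe.h5';            % assumed scratch file name
lens = [100 511 512 513 1024 19000];   % assumed test lengths around 512

fid = H5F.create(filename,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
vl_type = H5T.copy('H5T_C_S1');
H5T.set_size(vl_type,'H5T_VARIABLE');
space = H5S.create_simple(1, 1, []);

% One scalar-like dataset per test length
for k = 1:numel(lens)
    name = sprintf('len_%d', lens(k));
    dset = H5D.create(fid, name, vl_type, space, 'H5P_DEFAULT');
    H5D.write(dset, vl_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', ...
        {repmat('x',[1 lens(k)])});
    H5D.close(dset);
end

H5S.close(space);
H5T.close(vl_type);
H5F.close(fid);

% Read everything back with the high-level function and compare lengths
readLens = zeros(size(lens));
for k = 1:numel(lens)
    t = h5read(filename, sprintf('/len_%d', lens(k)));
    readLens(k) = length(t{1});
    fprintf('wrote %5d chars, read back %5d\n', lens(k), readLens(k));
end
```

If every length above 511 reads back as 511, that would point to a fixed internal limit rather than something that depends on the data.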
4 Comments
Oleg Komarov
on 15 Sep 2012
Edited: Oleg Komarov
on 15 Sep 2012
I found a description of the fields for H5F.get_mdc_config at http://www.hdfgroup.org/HDF5/doc/RM/RM_H5F.html#File-SetMdcConfig, and the properties set_initial_size and initial_size may be relevant to the buffer.
However, I am unsure where to set those properties: at the file, dataset, or property list level (H5F, H5D, H5P).
I think it would be faster to submit a technical support request to TMW or to the HDF Group.
Post any solution here (I am curious as well).
per isakson
on 16 Sep 2012
Edited: per isakson
on 16 Sep 2012
Here is a link to hdf-forum. A few MATLAB-related questions have been answered there. I cannot really contribute.