Errors while reading binary data files

Ahmed Zankoor on 21 Apr 2021
Commented: Ahmed Zankoor on 21 Apr 2021
I am trying to read binary files with, for example, uint32 data entries. The function reads the data correctly up to a certain element and then returns unexpected values (which I am sure do not exist in the original file). In some cases, these values blow up to very large numbers: for example, I was reading a uint32 data file with a maximum value of ~8000, and the maximum of the data read was ~4.127*10^9. The code I am using is shown below. (Note: the asterisk has no effect. I repeated this with different files and checked the data using other programming languages and software.)
function [X,Y,Z,Volume] = GetBin(filename,volumeSize,nOfBytes)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Reads a binary formatted file into a 3D MATLAB matrix.
%
% INPUT:
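% filename: string, name of binary file for reading (inside the data\ folder)
% volumeSize: 3-element vector, expected size [X Y Z] of the volume
% nOfBytes: integer, number of bytes per entry (1, 2 or 4)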
%
% OUTPUT:
% X,Y,Z: integer, size of matrix in cartesian coordinates
% Volume: integer 3D matrix, voxel values (labels)
% Ahmed Zankoor , April 2021
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
file = "data\" + filename;
fid = fopen(file, 'rt');
if fid == -1
error('Cannot open file for reading: %s', file);
end
X = volumeSize(1);Y = volumeSize(2);Z = volumeSize(3);
% Read the binary file. The precision passed to fread depends on nOfBytes:
% 1 byte is read as an 8-bit unsigned integer (uint8), 2 bytes as a 16-bit
% unsigned integer (uint16) and 4 bytes as a 32-bit unsigned integer (uint32).
if nOfBytes == 1
data = fread(fid,Inf,'*uint8');
data = uint8(data);
elseif nOfBytes == 2
data = fread(fid,Inf,'*uint16');
data = uint16(data);
elseif nOfBytes == 4
data = fread(fid,Inf,'*uint32');
data = uint32(data);
else
error('Unrecognized number of bytes per entry.')
end
fclose(fid);
if length(data)~= X*Y*Z
disp('Size of data does not match size of Volume.')
disp(['Size of data = ' num2str(length(data))])
disp(['Size of volume = ' num2str(X*Y*Z)])
end
Z = floor(length(data)/(X*Y));
data = data(1:X*Y*Z);
Volume = reshape(data,X,Y,Z);
end
For visualization, the image attached shows an example of the read data, where the top is correctly read data and then the mess below is because of the errors. I wonder if anyone knows why this may be happening?
Thank you.
  2 Comments
Ahmed Zankoor on 21 Apr 2021
No, I tried with files written by C++ and others exported from a commercial software which I think is also written in C++. Same problem.


Accepted Answer

Jan on 21 Apr 2021
Edited: Jan on 21 Apr 2021
The problem is hidden here:
fid = fopen(file, 'rt');
This opens the file in text mode on Windows. In this mode some byte values are treated as control characters instead of data: for example, a carriage return followed by a line feed (bytes 13 and 10) is read as a single line feed, so one byte is dropped, and ^Z (byte 26) can be interpreted as the end of the file. Therefore a file with arbitrary bytes can yield fewer bytes after the import than it has on disk, and for multi-byte types such as uint32 every value after a dropped byte is misaligned.
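To illustrate why a single dropped byte can produce huge values, here is a minimal sketch that simulates the effect in memory, without touching a file. The test values are made up for illustration, and the byte layout assumes a little-endian machine (which holds for the platforms MATLAB currently runs on).

```matlab
% Hypothetical data: small uint32 values, as in the question.
% 2573 is 0x00000A0D, so its bytes on disk are 13, 10, 0, 0 (CR, LF, 0, 0).
values = uint32([100; 2573; 8000; 7999; 42]);

% View the values as the raw byte stream that would be stored in the file.
bytes = typecast(values, 'uint8');

% Simulate the text-mode translation: drop every CR (13) that is directly
% followed by an LF (10).
isCR = bytes(1:end-1) == 13 & bytes(2:end) == 10;
bytes([isCR; false]) = [];

% Reinterpret the shortened stream as uint32 (only complete 4-byte groups).
nKeep = 4 * floor(numel(bytes) / 4);
corrupted = typecast(bytes(1:nKeep), 'uint32');

disp(max(values));      % largest value in the original data
disp(max(corrupted));   % far larger, because the bytes are shifted by one
```

The first value survives because it lies before the dropped byte. Everything after it is assembled from the wrong bytes, which matches the picture of correct data at the top and a mess below.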
The solution is easy and makes the code independent of the platform it runs on: open the file in binary mode by omitting the 't':
fid = fopen(file, 'r');
I prefer this for text files too, because the old DOS control characters are a common source of unexpected behaviour. The interpretation in text mode also costs runtime.
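A quick way to check this on your own machine is a round trip through a temporary file. The sketch below uses made-up test values that deliberately contain the bytes 13, 10 and 26. The result of the text-mode read depends on the platform, so it is only printed and not checked.

```matlab
% Hypothetical test data containing the bytes 13, 10 (in 2573) and 26.
values = uint32([100; 2573; 26; 8000]);
tmpFile = [tempname '.bin'];

% Write in binary mode ('w' without 't').
fid = fopen(tmpFile, 'w');
if fid == -1
    error('Cannot open file for writing: %s', tmpFile);
end
fwrite(fid, values, 'uint32');
fclose(fid);

% Read back in binary mode: the data must be identical.
fid = fopen(tmpFile, 'r');
readBack = fread(fid, Inf, '*uint32');
fclose(fid);
assert(isequal(readBack, values));

% Count the bytes delivered in binary and in text mode. On Windows the
% text-mode count is expected to be smaller; elsewhere it may be the same.
fid = fopen(tmpFile, 'r');
nBinary = numel(fread(fid, Inf, '*uint8'));
fclose(fid);
fid = fopen(tmpFile, 'rt');
nText = numel(fread(fid, Inf, '*uint8'));
fclose(fid);
delete(tmpFile);

fprintf('Binary mode: %d bytes, text mode: %d bytes\n', nBinary, nText);
```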
A hint: fread(fid, inf, '*uint8') already returns a UINT8 array, so you can omit lines like:
data = uint8(data);
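Putting both suggestions together, the reading part could look roughly like the sketch below. The function name readVolumeSketch is made up, it takes the full file path instead of prepending the data folder, and it stops with an error on a size mismatch instead of truncating as the original does. Treat it as one possible variant, not as a drop-in replacement.

```matlab
function Volume = readVolumeSketch(file, volumeSize, nOfBytes)
% Hypothetical revision of GetBin: binary mode and no redundant casts.

% Map the number of bytes per entry to an fread precision string. The
% leading asterisk makes fread return the same class as stored in the file.
switch nOfBytes
    case 1
        precision = '*uint8';
    case 2
        precision = '*uint16';
    case 4
        precision = '*uint32';
    otherwise
        error('Unrecognized number of bytes per entry.');
end

fid = fopen(file, 'r');                 % binary mode: no 't'
if fid == -1
    error('Cannot open file for reading: %s', file);
end
data = fread(fid, Inf, precision);      % already has the requested class
fclose(fid);

% Assumption: a size mismatch is treated as an error here.
if numel(data) ~= prod(volumeSize)
    error('File holds %d entries, expected %d.', ...
        numel(data), prod(volumeSize));
end
Volume = reshape(data, volumeSize(:).');
end
```

A call such as Volume = readVolumeSketch(fullfile('data', filename), [X Y Z], 4) would then return a uint32 array of size X-by-Y-by-Z.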
  1 Comment
Ahmed Zankoor on 21 Apr 2021
This is absolutely right, all issues are fixed. Thank you so much.
