save function produces a file significantly larger than the variables it contains

I'm using R2018b running on a Linux (CentOS 7) machine.
I have a structure X that is about 400 MB in size inside MATLAB (according to the size listed in "Workspace"). If I save that variable to a .mat file, e.g.:
> save('MyStructure.mat','X')
The resulting .mat file is about 1.3 GB in size, or roughly 3X larger than the variable it is saving. Looking at my preferences, the default save format is V7.3.
However, if I save the same variable using the V7 format
> save('MyStructure.V7.mat','X','-v7')
The resulting .mat file is now only about 12 MB, indicating that the data was compressed by about 30X when saving the file.
Best I can tell from reading documentation and help forums, both V7 and V7.3 save files using compression. However, why would I be getting a file size expansion of 3X when using the V7.3 format?

4 Comments

"Best I can tell from reading documentation and help forums, both V7 and V7.3 save files using compression."
True, but they use different compression methods (so getting different file sizes is entirely possible). V7.3 is based on HDF5; the method used for V7 does not appear to be documented.
Would it make sense that the HDF5 schema results in a 3X increase in file size? That is, the HDF5-based .mat file is 3X larger than the number of bytes that the variable uses when it is in the MATLAB workspace?
Hi Peter, I have the same issue. It doesn't make much sense to me, and I also couldn't find good documentation to overcome this issue.
I suspect the two reasons below, but I need to analyze and experiment further before I can say I have solved it:
1) Some data types are not very efficient when saving. Maybe using different data types could help. Instead of tables, using arrays might help.
2) .mat files might become corrupted, which could happen when saving the file to a network drive with bad connection.
If you are having issues with -v7.3, you might want to try -v7. I did a quick test. For the same variables and workspace (tested both), -v7 .mat files are significantly smaller than -v7.3 .mat files.
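A quick test along these lines can be sketched as follows. The struct array X here is hypothetical stand-in data (not the original poster's variable), and the file names come from tempname, so adjust both to your own case. It only reports the sizes; the actual ratio will depend on your data.

```matlab
% Hypothetical stand-in data: a 1-by-2000 struct array with small fields.
n = 2000;
X = struct('id', num2cell(1:n), 'name', 'sample', 'values', zeros(1, 10));

% Size of the variable in memory, as reported in the Workspace browser.
info = whos('X');
memBytes = info.bytes;

% Save the same variable in both formats to temporary files.
fileV7  = [tempname '.mat'];
fileV73 = [tempname '.mat'];
save(fileV7,  'X', '-v7');
save(fileV73, 'X', '-v7.3');

% Read back the on-disk sizes.
dV7  = dir(fileV7);
dV73 = dir(fileV73);

fprintf('In memory : %d bytes\n', memBytes);
fprintf('-v7 file  : %d bytes (%.2fx of memory)\n', dV7.bytes,  dV7.bytes  / memBytes);
fprintf('-v7.3 file: %d bytes (%.2fx of memory)\n', dV73.bytes, dV73.bytes / memBytes);

% Clean up the temporary files.
delete(fileV7);
delete(fileV73);
```

Replacing the stand-in X with your own variable gives a direct comparison of the two formats for your data.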

Answers (1)

The size of the MAT-file is directly related to the size of the variables you are saving in it, plus some overhead for the structure. Since you are saving a structure in your case, that overhead adds to the size of the file (when using -v7.3).
Please refer to the MAT-file versions page for more information on this.
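One way to see how much the container type matters is to save the same numbers once as a single array and once split across many struct elements. This is only a sketch under assumed data: the variable names flat and S, and the size n, are made up for illustration. Typically the struct version comes out larger under -v7.3, but check the printed numbers on your own system.

```matlab
% Hypothetical data: the same 5000-by-4 values stored two different ways.
n = 5000;
flat = rand(n, 4);                     % one numeric array
S = struct('row', num2cell(flat, 2));  % n-by-1 struct array, one 1-by-4 row per element

% Save each variable to its own -v7.3 file.
fileFlat   = [tempname '.mat'];
fileStruct = [tempname '.mat'];
save(fileFlat,   'flat', '-v7.3');
save(fileStruct, 'S',    '-v7.3');

% Compare the on-disk sizes.
dFlat   = dir(fileFlat);
dStruct = dir(fileStruct);
fprintf('Numeric array: %d bytes on disk\n', dFlat.bytes);
fprintf('Struct array : %d bytes on disk\n', dStruct.bytes);

% Clean up the temporary files.
delete(fileFlat);
delete(fileStruct);
```

If the struct file is much larger, restructuring the data into fewer, larger arrays before saving may be worth trying.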

1 Comment

Ankit, thank you for the response. What doesn't make sense to me is that I would expect two things:
1) The overhead needed to save the structure adds something small (perhaps 10% to 20%) to the file size. So in my case, a 400 MB structure perhaps requires 500 MB to store. However, I'm seeing an increase of more than 200% in the structure size, with a 1,300 MB .mat file.
2) The compression used in V7.3 mat files would reduce the file size significantly from that.

Release

R2018b

Asked:

on 10 Feb 2019

Commented:

on 24 Nov 2021