Why does converting a table to a struct increase memory usage by 15x??
I'm reading tabular data from a file using the "readtable" function - each file has 54 fields and 1000 rows. It takes up 250 kB on disk, and 450 kB in memory as a table. Then, when I try to convert the table to a struct using the "table2struct" function, the resulting struct takes up 6.5 MB!!! Why does converting from a table to a struct result in a 15x increase in memory usage? I have several thousand of these files to manipulate, so 450 kB per file is fine, but 6.5 MB makes MATLAB run out of memory! No good.
Here's some output to verify my assertions:
>> t = readtable('example_file.dat');
Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the
VariableDescriptions property.
>> t.Properties
ans =
struct with fields:
Description: ''
UserData: []
DimensionNames: {'Row' 'Variables'}
VariableNames: {1×54 cell}
VariableDescriptions: {1×54 cell}
VariableUnits: {}
RowNames: {}
>> size(t)
ans =
1000 54
>> ts = table2struct(t);
>> size(ts)
ans =
1000 1
>> whos t ts
Name Size Bytes Class Attributes
t 1000x54 457776 table
ts 1000x1 6483456 struct
Why does converting from table to struct waste so much memory, and how can I fix it?
Thanks in advance for any help!
PS: For some reason this form won't allow me to select a release - I'm using MATLAB R2017a.
2 Comments
Walter Roberson
on 28 Oct 2019
Table objects store one array per variable (plus extra arrays for variables that are cell arrays), while struct arrays store one array per field per element.
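That per-element bookkeeping is where the memory goes: 6,483,456 bytes / (1000 rows × 54 fields) is roughly 120 bytes per field element, i.e. 8 bytes of double data plus on the order of 112 bytes of per-array overhead. A minimal sketch of the effect (the variable names are made up, and exact byte counts vary by release):
% Compare one 1000x1 double stored directly, inside a scalar struct,
% and split across a 1000x1 struct array (one scalar per element).
x        = rand(1000, 1);               % 8000 bytes of data
s_scalar = struct('x', x);              % still a single array inside
s_array  = struct('x', num2cell(x));    % 1000 separate 1x1 arrays
whos x s_scalar s_array                 % s_array is far larger than the other two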
Steven Lord
on 28 Oct 2019
PS: For some reason this form won't allow me to select a release - I'm using MATLAB R2017a.
Select the product first, then the release dropdown should populate.
Answers (1)
per isakson
on 28 Oct 2019
Edited: per isakson
on 28 Oct 2019
What kind of structure do you expect? table2struct can create two kinds.
- a struct array, with one struct per row of the table
- a scalar struct, with each variable of the table stored as one field
Try
t = readtable('example_file.dat');
struct_scalar = table2struct( t, 'ToScalar', true );
struct_array = table2struct(t);
whos
which returns
Name Size Bytes Class Attributes
struct_array 12x1 15040 struct
struct_scalar 1x1 2720 struct
t 12x10 4068 table
where example_file.dat contains
f00 f01 f02 f03 f04 f05 f06 f07 f08 f09
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
0 1 2 3 4 5 6 7 8 9
I assume you expected table2struct(t) to create a scalar structure.
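If the scalar form is what you want, here is a minimal sketch of applying it across many files; the folder name and file pattern are placeholders for your several thousand files:
files = dir(fullfile('data_folder', '*.dat'));       % placeholder location
for k = 1:numel(files)
    t = readtable(fullfile(files(k).folder, files(k).name));
    s = table2struct(t, 'ToScalar', true);           % one 1000x1 array per field
    % ... process s here (or keep working with t directly) ...
end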
4 Comments
Peter Perkins
on 1 Nov 2019
It all depends on what you are doing.
If you are using a struct array (as opposed to a scalar struct each of whose fields is itself a vector), a table is a clear winner memory-wise. This makes a big difference as your data size gets larger.
Tables allow you to easily slice your data in two directions. A struct array lets you slice along the "array" dimensions, but not so easily along the "fields" dimension. And tables support operations like joins and sorting and unique and others that struct arrays don't. So syntactically, I think you will be happier with tables.
Performance-wise, it depends on what you are doing. Subscripted assignment and reference for tables are usually the things people flag, but those have been getting more performant over the last couple of releases (and that will continue). I think you want to go for ease of use and move away from tables only if you hit real performance issues. Even then, it's usually possible to vectorize your code, or to "hoist" a few variables out of the table for a short scope in your code, as sketched below.
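A sketch of the slicing and "hoisting" points above, assuming the first few variables are numeric (the names are pulled from the table itself, since the real files have 54 auto-generated names):
t = readtable('example_file.dat');
names = t.Properties.VariableNames;
% Slice in both directions: rows by a condition, variables by name.
sub = t(t.(names{1}) > 0, names(2:3));
% "Hoist" one variable out of the table for a tight loop, then write it back.
v = t.(names{2});              % plain numeric vector, cheap to index
for k = 1:numel(v)
    v(k) = 2*v(k);             % stand-in for real per-element work
end
t.(names{2}) = v;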