What is the best way to insert thousands of values into multiple matrices?
Hello,
I have a folder with 10,000 .mat files. Each file contains a structure with 8 different variables (one value each).
Each file is named after the row and column it corresponds to in a 330x300 matrix (e.g. 123_25 is the file with the values for the 123rd row and 25th column).
I have 8 matrices of size 330x300, one for each variable, filled with NaNs. I would like to know the best technique for inserting the values from my .mat files into these matrices.
So far I have tried 2 methods:
- A double loop with j and k being the row and column indexes respectively (for every j, k loops from 1 to 300). Every time k changes, the corresponding .mat file is loaded and its values are inserted into the corresponding matrices. The 8 matrices are loaded BEFORE the loops.
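%% Example
% load the 8 matrices first
for j = 1:330
    for k = 1:300
        load(sprintf('%d_%d.mat', j, k));   % file name built from row and column
        % j_k stands for the structure loaded from that file
        matrix_1(j,k) = j_k.variable1;
        matrix_2(j,k) = j_k.variable2;
        matrix_3(j,k) = j_k.variable3;
        % etc.
    end
end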
- The same double loop, but instead of loading the 8 matrices, I use the "matfile()" function and replace only the NaN at index (j,k) with the corresponding .mat file's values. As before, the corresponding .mat file is loaded at every iteration.
%% Example
% Create one matfile object per matrix
for i = 1:numel(list_names)
    % save expects the file name first, then the variable name, then the version flag
    save([temp_folder list_names{i} '.mat'], list_names{i}, '-v7.3');
    m.(list_names{i}) = matfile([temp_folder list_names{i} '.mat'],'Writable',true);
end
% list_names is a list with the names of the variables. In this example the list would go from matrix_1 to matrix_8
for j = 1:330
    for k = 1:300
        load(sprintf('%d_%d.mat', j, k));
        m.matrix_1(j,k) = j_k.variable1;
        m.matrix_2(j,k) = j_k.variable2;
        % etc.
    end
end
From the matfile documentation, I read that the most efficient way to deal with this would be to load everything into memory at once and do all the replacements there. Please note that the 10,000 files together never exceed 250 MB, and my PC has 16 GB of RAM.
I would like to try another method:
Load all the .mat files into memory, load the 8 matrices, then insert all the values with a loop, without loading a file at every iteration. The difficulty is that although my .mat files have different names, they are all constructed the same way. So when I load a file into the workspace and then load another one, it replaces the previous one, and I cannot have 2 files loaded at the same time. Is there a way to load all these files at once even though they are built the same way, or is there a way to create dynamic variable names (I know it is a bad idea) so I can load more than 1 file at a time?
Finally, which method would be the fastest? Maybe there is another one I didn't think of?
I hope my explanation was clear; if not, I apologize and will try to explain again as clearly as possible.
Have a good day and thank you!
More Answers (1)
Steven Lord
on 18 May 2020
Rather than creating 8 individual variables, why not create a 3-dimensional array of size [330 300 8]?
Z = NaN(6, 5, 4);
for pages = 1:4
    for columns = 1:5
        for rows = 1:6
            Z(rows, columns, pages) = (rows*pages)+(columns^(pages-1));
        end
    end
end
Z(4, 2, 3) % 4*3 + 2^2 = 16
Although with the way your data is ordered, you'd want pages to be the innermost loop. That way you can load your data as soon as rows and columns are defined and iterate through the loaded data in the pages loop, filling in the appropriate elements in Z at each iteration.
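A minimal sketch of how that could look for the files described in the question. Several things here are assumptions rather than facts from the thread: the field names variable1 to variable8, the demo folder and its made-up values, and the idea of looping over the files that exist (via dir) instead of over every row and column. Calling load with an output argument returns the file's contents as a struct, so each new file no longer overwrites the previous one in the workspace.

```matlab
% --- Demo setup (hypothetical data, only so the sketch runs on its own) ---
demoDir  = tempname;                       % throwaway folder
mkdir(demoDir);
varNames = arrayfun(@(p) sprintf('variable%d', p), 1:8, ...
                    'UniformOutput', false);   % assumed field names
demoCells = [123 25; 7 300; 330 1];        % [row col] of the demo files
for n = 1:size(demoCells, 1)
    r = demoCells(n, 1);
    c = demoCells(n, 2);
    data = struct();
    for p = 1:8
        data.(varNames{p}) = 1000*r + 10*c + p;   % made-up value
    end
    save(fullfile(demoDir, sprintf('%d_%d.mat', r, c)), 'data');
end

% --- Fill one 330x300x8 array, one page per variable ---
Z = NaN(330, 300, 8);
files = dir(fullfile(demoDir, '*_*.mat'));
for n = 1:numel(files)
    % Row and column are parsed from the file name, e.g. '123_25.mat'
    rc = sscanf(files(n).name, '%d_%d.mat');
    % load with an output returns a struct; nothing lands in the workspace
    S    = load(fullfile(demoDir, files(n).name));
    top  = fieldnames(S);
    data = S.(top{1});                     % assumes one structure per file
    for p = 1:8                            % pages as the innermost loop
        Z(rc(1), rc(2), p) = data.(varNames{p});
    end
end

matrix_1 = Z(:, :, 1);                     % a single variable's matrix, if needed
rmdir(demoDir, 's');                       % remove the demo folder
```

Cells without a file simply stay NaN, and Z(:, :, p) gives the 330x300 matrix for variable p whenever a separate matrix is needed.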
