Fastest and Most Memory Efficient Way to Generate Structure from Matrix of Data

I have a matrix of numerical data, loads(ncases, ngridpoints), where ncases is the number of permutations of the independent variables (M, alpha and beta) and ngridpoints is the number of points in the physical model. To make this concrete, consider a specific example. I have cases defined by the following independent variable permutations, with rows sorted in order of M, then alpha, then beta:
M = 0.5, alpha = -2.5, beta = 0
M = 0.5, alpha = -2.5, beta = 5
M = 0.5, alpha = 0, beta = -5
M = 0.5, alpha = 0, beta = 0
M = 0.5, alpha = 0, beta = 5
M = 0.5, alpha = 2.5, beta = 0
M = 0.5, alpha = 2.5, beta = 5
M = 0.5, alpha = 5, beta = -5
M = 0.5, alpha = 5, beta = 0
M = 0.5, alpha = 5, beta = 5
This is 10 permutations, and if I have 5 grid points, then I have loads(10, 5). This can also be seen as an incomplete subset of the "full" set of all permutations of the following independent variable ranges:
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];
Where, if all the permutations were listed (12 cases) with rows sorted in the order of M, alpha and beta, then lines 1 and 7 are missing (i.e. the permutation cases M = 0.5, alpha = -2.5, beta = -5 and M = 0.5, alpha = 2.5, beta = -5 are missing).
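The missing cases described above can also be found programmatically. The following is a minimal sketch, assuming the illustrative names fullCases, actualCases, missingCases and missingRows (none of which appear in the original code); it builds the full set in the same row order (M, then alpha, then beta) and uses setdiff with the 'rows' option:

```matlab
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];

% Full set of permutations; listing beta first in ndgrid makes beta vary
% fastest, so rows end up sorted by M, then alpha, then beta
[b, a, m] = ndgrid(beta, alpha, M);
fullCases = [m(:), a(:), b(:)];

% The ten cases that are actually available (rows 1 and 7 removed)
actualCases = fullCases;
actualCases([1 7], :) = [];

% Rows of the full set that have no counterpart in the available set
[missingCases, missingRows] = setdiff(fullCases, actualCases, 'rows');
disp(missingRows.')
disp(missingCases)
```

Here missingRows holds the row numbers in the full set and missingCases holds the corresponding (M, alpha, beta) values.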
I would like to generate a structure, in which the top hierarchy field is the gridpoint name, and then for each gridpoint, the loads are stored in a (M x alpha x beta) array (in this case 1 x 4 x 3). I also want to ensure that any missing data is filled using linear interpolation.
I have developed a way to do it, but feel that it is not the most efficient. Here it is:
%% Generate data to use
% Define gridpoint names
gridnames = ["g1"; "g2"; "g3"; "g4"; "g5"];
glength = length(gridnames);
% Define actual loads data
loads = rand(10, glength);
%% Generate matrices for full set of permutations and "actual" permutations
% Everything that follows till the end of this code is my method for generating the structure
% Define independent parameters
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];
totalPerms = length(M)*length(alpha)*length(beta);
% Define full array of independent parameters
MCol = repmat(M, totalPerms/length(M), 1);
alphaCol = sort(repmat(alpha, totalPerms/length(alpha), 1));
betaCol = repmat(beta, totalPerms/length(beta), 1);
fullIndependentMat = [MCol alphaCol betaCol];
% Define actual array of independent parameters
MCol([1 7]) = [];
alphaCol([1 7]) = [];
betaCol([1 7]) = [];
actualIndependentMat = [MCol alphaCol betaCol];
%% Generate Structure
for j = 1:glength
    % Initialize new container for loads data
    arrayLOADS = nan(totalPerms, 1);
    counter = 1;
    for i = 1:totalPerms
        if fullIndependentMat(i, 1) == actualIndependentMat(counter, 1) && ...
                fullIndependentMat(i, 2) == actualIndependentMat(counter, 2) && ...
                fullIndependentMat(i, 3) == actualIndependentMat(counter, 3)
            arrayLOADS(i) = loads(counter, j);
            counter = counter + 1;
        end
    end
    arrayLOADS = fillmissing(reshape(arrayLOADS, [length(M), length(alpha), length(beta)]), 'linear');
    finalLoads.(gridnames{j}).data = arrayLOADS;
end
I would like to end up with finalLoads.
You can see that loops are used to generate the structure. I wonder if there is a more efficient way. Possibly without using any loops and vectorising somehow instead?
I am running R2021a, but if there is an even better way to do this in R2023a that isn't available in R2021a, I would appreciate that method too.
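One point worth checking in the code above, offered as a sketch rather than a definitive diagnosis: the rows are sorted with beta varying fastest, while reshape fills arrays column-major, so reshaping the column straight into [length(M), length(alpha), length(beta)] makes alpha vary fastest instead. The example below uses row numbers as stand-in data so the placement is easy to follow; the names v, naive and correct are illustrative assumptions, not part of the original code.

```matlab
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];
nM = numel(M); nA = numel(alpha); nB = numel(beta);

% Stand-in data: v(k) = k, where row k follows the (M, alpha, beta) sort
% order, i.e. beta varies fastest
v = (1:nM*nA*nB).';

% Direct reshape: element (1, 2, 1) should hold alpha(2), beta(1), which
% is row 4 of the sorted list, but it receives row 2 instead
naive = reshape(v, [nM, nA, nB]);

% Reshape with the fastest-varying variable first, then reorder the
% dimensions to M x alpha x beta
correct = permute(reshape(v, [nB, nA, nM]), [3 2 1]);

fprintf('naive(1,2,1) = %d, correct(1,2,1) = %d\n', naive(1,2,1), correct(1,2,1));
```

If the array is built this way, fillmissing then interpolates along a genuine alpha (dimension 2) or beta (dimension 3) axis, depending on the dimension argument.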

Answers (2)

Jaynik on 4 Nov 2024
The current approach loops over every permutation to match and fill the data, which can be replaced with MATLAB's vectorized functions. Here is a more efficient way to achieve the same result: ismember locates each available case in the full grid, and accumarray places the loads into a multi-dimensional array. A single loop over the grid points remains.
gridnames = ["g1"; "g2"; "g3"; "g4"; "g5"];
glength = length(gridnames);
loads = rand(10, glength);
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];
% Create a full grid of permutations
[alphaGrid, betaGrid] = ndgrid(alpha, beta);
fullIndependentMat = [repmat(M, numel(alphaGrid), 1), alphaGrid(:), betaGrid(:)];
% Define actual array of independent parameters
actualIndependentMat = [
M, -2.5, 0;
M, -2.5, 5;
M, 0, -5;
M, 0, 0;
M, 0, 5;
M, 2.5, 0;
M, 2.5, 5;
M, 5, -5;
M, 5, 0;
M, 5, 5
];
% Map actual permutations to indices
[~, loc] = ismember(actualIndependentMat, fullIndependentMat, 'rows');
% Convert linear indices to subscript indices for accumarray
[subAlpha, subBeta] = ind2sub([length(alpha), length(beta)], loc);
% Initialize the structure
finalLoads = struct();
% Generate Structure using accumarray
for j = 1:glength
    % Use accumarray to place known loads into their respective positions
    arrayLOADS = accumarray([ones(size(subAlpha)), subAlpha, subBeta], loads(:, j), [1, length(alpha), length(beta)], [], NaN);
    % Interpolate missing values
    arrayLOADS = fillmissing(arrayLOADS, 'linear');
    % Assign to final structure
    finalLoads.(gridnames{j}).data = arrayLOADS;
end
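The remaining loop over grid points could in principle be removed as well. The following is a minimal sketch, not a benchmarked recommendation: it assumes all grid points share the same set of cases, stacks them along a fourth dimension, fills every grid point in one fillmissing call along the alpha dimension, and then converts the result to the requested structure. The names fullCases, actualCases, allLoads and pages are illustrative assumptions; every function used is available in R2021a.

```matlab
gridnames = ["g1"; "g2"; "g3"; "g4"; "g5"];
nG = numel(gridnames);
loads = rand(10, nG);
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];

% Full grid in array (column-major) order: M fastest, then alpha, then beta
[mG, aG, bG] = ndgrid(M, alpha, beta);
fullCases = [mG(:), aG(:), bG(:)];

% Available cases in the row order of loads (sorted by M, alpha, beta),
% with the two missing permutations removed
actualCases = sortrows(fullCases);
actualCases([1 7], :) = [];

% Position of every available case inside the full grid
[tf, loc] = ismember(actualCases, fullCases, 'rows');
assert(all(tf), 'Every available case must exist in the full grid.');

% Place all grid points at once, one column per grid point
allLoads = nan(size(fullCases, 1), nG);
allLoads(loc, :) = loads;
allLoads = reshape(allLoads, [numel(M), numel(alpha), numel(beta), nG]);

% Fill gaps for every grid point in one call, along alpha (dimension 2)
allLoads = fillmissing(allLoads, 'linear', 2);

% Convert to the requested structure: one field per grid point
pages = num2cell(allLoads, [1 2 3]);
pages = cellfun(@(d) struct('data', d), pages(:), 'UniformOutput', false);
finalLoads = cell2struct(pages, cellstr(gridnames), 1);
```

If the structure is not strictly required, keeping the 4-D array allLoads and indexing it by grid point avoids the conversion step altogether.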
You can read more about ndgrid, ismember, ind2sub, accumarray and fillmissing in the MATLAB documentation.
Hope this helps!

Arjun on 4 Nov 2024
I see that you have code for generating a structure from a matrix of data and that you are looking for ways to make it more efficient.
The code you have provided works, but a couple of enhancements can be made: use the "ndgrid" function to generate the full independent matrix, and use the "ismember" function to find the indices instead of looping through all the permutations. These built-in functions are much better optimized than doing the same task manually.
Kindly refer to the modified code below for a better understanding:
gridnames = ["g1"; "g2"; "g3"; "g4"; "g5"];
glength = length(gridnames);
loads = rand(10, glength);
% Define independent parameters
M = 0.5;
alpha = [-2.5; 0; 2.5; 5];
beta = [-5; 0; 5];
% Generate full set of permutations
[MGrid, AlphaGrid, BetaGrid] = ndgrid(M, alpha, beta);
fullIndependentMat = [MGrid(:), AlphaGrid(:), BetaGrid(:)];
% Define actual array of independent parameters
actualIndependentMat = [
0.5, -2.5, 0;
0.5, -2.5, 5;
0.5, 0, -5;
0.5, 0, 0;
0.5, 0, 5;
0.5, 2.5, 0;
0.5, 2.5, 5;
0.5, 5, -5;
0.5, 5, 0;
0.5, 5, 5
];
% Find indices of actual permutations in the full set
[~, idx] = ismember(actualIndependentMat, fullIndependentMat, 'rows');
% Initialize the structure
finalLoads = struct();
% Generate the structure (one loop over grid points, no inner search loop)
for j = 1:glength
    % Initialize new container for loads data
    arrayLOADS = nan(size(fullIndependentMat, 1), 1);
    % Assign the loads to the correct positions
    arrayLOADS(idx) = loads(:, j);
    % Reshape and fill missing values
    reshapedLOADS = reshape(arrayLOADS, [length(M), length(alpha), length(beta)]);
    filledLOADS = fillmissing(reshapedLOADS, 'linear', 2, 'EndValues', 'nearest');
    % Store in the structure
    finalLoads.(gridnames{j}).data = filledLOADS;
end
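A caveat on the row matching, as a hedged aside: ismember compares values exactly. That works for the values used here (0.5, 2.5, 5 and so on are exactly representable in binary floating point), but values produced by arithmetic may differ in the last bits. A minimal sketch of a tolerance-based alternative using ismembertol follows; the two-column data and the tolerance 1e-9 are illustrative assumptions only.

```matlab
% Illustrative full set and one available case whose second value is
% computed (0.1 + 0.2 is not exactly 0.3 in floating point)
fullCases = [0.5 0.1; 0.5 0.2; 0.5 0.3];
actualCases = [0.5, 0.1 + 0.2];

% Exact row comparison fails to find the case
[tfExact, locExact] = ismember(actualCases, fullCases, 'rows');

% Row comparison within a relative tolerance finds it in row 3
[tfTol, locTol] = ismembertol(actualCases, fullCases, 1e-9, 'ByRows', true);

fprintf('exact match: %d, tolerant match: %d (row %d)\n', tfExact, tfTol, locTol);
```

With ismembertol the tolerance is scaled by the magnitude of the data by default, so it should be chosen well below the spacing between neighbouring parameter values.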
Kindly refer to the documentation of the "ndgrid" and "ismember" functions.
I hope this will help!

Release

R2021a