Mean distance function upgrade question
Dear Team,
The code below calculates the mean distance from each point in group 1 to its nearest point in group 2. For a few thousand points (x,y,z) it works fine, but with group1 = 70000 points and group2 = 80000 points it is far too slow. What should I add or change in the code to make it faster?
data = table2array(readtable("test.xlsx"));
group1 = length(data(~isnan(data(:,1))));
group2 = length(data(~isnan(data(:,5))));
tic
for i=1:group1
display(i);
minval = inf;
for j=1:group2
point(i,j) = sqrt((data(j,5)-data(i,1))^2+(data(j,6)-data(i,2))^2+(data(j,7)-data(i,3))^2);
if point(i,j)<minval
minval = point(i,j);
end
end
values(i) = minval;
end
avg = mean(values);
toc
Thanks in advance
Accepted Answer
More Answers (2)
Jan on 31 Oct 2022 (edited by Jan on 1 Nov 2022)
data = table2array(readtable("test.xlsx"));
% group1 = length(data(~isnan(data(:,1)))); Faster:
group1 = nnz(~isnan(data(:,1)));
group2 = nnz(~isnan(data(:,5)));
tic
values = zeros(group1, 1); % Pre-allocate
for i = 1:group1
% Wastes time: display(i);
% Do you really need the huge point(i,j) array? If not, keep the value
% in a scalar:
minval = inf;
for j = 1:group2
% Avoid the expensive SQRT while searching for the minimum:
point = (data(j,5)-data(i,1))^2 + ...
(data(j,6)-data(i,2))^2 + ...
(data(j,7)-data(i,3))^2;
if point < minval
minval = point;
end
end
values(i) = sqrt(minval); % One SQRT is enough
end
avg = mean(values);
toc
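Vectorizing the inner loop is most likely faster. Inside the loop over i, replace the loop over j with (note the elementwise .^2, since the operands are now column vectors):
point = (data(1:group2,5) - data(i,1)).^2 + ...
(data(1:group2,6) - data(i,2)).^2 + ...
(data(1:group2,7) - data(i,3)).^2;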
values(i) = sqrt(min(point)); % One SQRT is enough
Now avoid creating the submatrices repeatedly:
values = zeros(group1, 1); % Pre-allocate!
A = data(:, 5:7); % Group 2 coordinates (NaN rows are ignored by MIN)
B = data(:, 1:3); % Group 1 coordinates
for i = 1:group1
point = sum((A - B(i, :)).^2, 2);
values(i) = sqrt(min(point)); % One SQRT is enough
end
avg = mean(values);
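One way to gain confidence in the vectorized loop is to check it against the plain double loop on a small synthetic data set. The sketch below assumes the column layout from the question (group 1 in columns 1 to 3, group 2 in columns 5 to 7, the shorter group padded with NaN). The random data and the sizes n1 and n2 are made up for illustration, and the expression A - B(i, :) relies on implicit expansion, so it needs R2016b or later.

```matlab
% Synthetic stand-in for test.xlsx (assumed layout: columns 1:3 and 5:7).
rng(0);                                   % reproducible random numbers
n1 = 50;  n2 = 80;                        % hypothetical group sizes
data = nan(max(n1, n2), 7);
data(1:n1, 1:3) = rand(n1, 3);
data(1:n2, 5:7) = rand(n2, 3);

% Reference: plain double loop, as in the question.
ref = zeros(n1, 1);
for i = 1:n1
    minval = inf;
    for j = 1:n2
        d = sqrt(sum((data(j, 5:7) - data(i, 1:3)).^2));
        minval = min(minval, d);
    end
    ref(i) = minval;
end

% Vectorized inner loop on squared distances, one SQRT per point.
A = data(1:n2, 5:7);
B = data(1:n1, 1:3);
values = zeros(n1, 1);
for i = 1:n1
    values(i) = sqrt(min(sum((A - B(i, :)).^2, 2)));
end

maxDiff = max(abs(values - ref));         % expected to be at rounding level
avg = mean(values);
```

Taking the minimum of the squared distances first is safe because SQRT is monotonic, so the nearest point is the same either way.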
Compare this with the nice and clean PDIST method suggested by Torsten.
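Torsten's code is not reproduced here, so the following is only a sketch of what such an approach might look like, assuming pdist2 from the Statistics and Machine Learning Toolbox. A full 70000-by-80000 distance matrix of doubles would need roughly 45 GB, so the sketch processes group 1 in blocks and asks pdist2 for the single smallest distance per point. The random data, the sizes, and the block size chunk are hypothetical.

```matlab
% Requires the Statistics and Machine Learning Toolbox (pdist2).
rng(0);
n1 = 500;  n2 = 800;                      % hypothetical group sizes
G1 = rand(n1, 3);                         % stand-in for data(1:group1, 1:3)
G2 = rand(n2, 3);                         % stand-in for data(1:group2, 5:7)

chunk  = 200;                             % hypothetical block size, tune to RAM
values = zeros(n1, 1);
for k = 1:chunk:n1
    idx = k:min(k + chunk - 1, n1);
    % With 'Smallest', 1 pdist2 returns a 1-by-numel(idx) row vector:
    % for each row of G1(idx, :), the smallest distance to any row of G2.
    values(idx) = pdist2(G2, G1(idx, :), 'euclidean', 'Smallest', 1);
end
avg = mean(values);
```

Whether this beats the hand-written loop depends on the MATLAB release and the machine, so it is worth timing both with tic and toc on the real data.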
3 Comments
Jan on 1 Nov 2022
Locally in my R2018b installation this is the fastest:
n = group1; % Number of valid points in group 1
S = 0;
a5 = data(:, 5);
a6 = data(:, 6);
a7 = data(:, 7);
for i = 1:n % Faster with PARFOR!
p = (a5 - data(i, 1)).^2 + ...
(a6 - data(i, 2)).^2 + ...
(a7 - data(i, 3)).^2;
S = S + sqrt(min(p));
end
avg = S / n;
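The PARFOR remark in the code comment can be sketched as follows. The loop body only accumulates into S, which PARFOR treats as a reduction variable, so the iterations can run in any order. This assumes the Parallel Computing Toolbox is installed; without it, the loop simply runs serially. The random data and the sizes n and m are made up for illustration.

```matlab
% A real speedup requires the Parallel Computing Toolbox.
rng(0);
n = 300;  m = 400;                        % hypothetical group sizes
P  = rand(n, 3);                          % stand-in for data(1:n, 1:3)
a5 = rand(m, 1);                          % stand-in for data(:, 5)
a6 = rand(m, 1);                          % stand-in for data(:, 6)
a7 = rand(m, 1);                          % stand-in for data(:, 7)

S = 0;
parfor i = 1:n
    q = P(i, :);                          % P is sliced: one row per iteration
    p = (a5 - q(1)).^2 + (a6 - q(2)).^2 + (a7 - q(3)).^2;
    S = S + sqrt(min(p));                 % reduction variable
end
avg = S / n;
```

Because the summation order is not fixed in parallel, the result can differ from the serial loop in the last few bits.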