Massive time required for pdist
Sebastian Stumpf
on 4 Oct 2021
Commented: Sebastian Stumpf
on 6 Oct 2021
Hello,
I am using the MATLAB function pdist to calculate the distance between two points. However, I noticed that the function takes a lot of time, even though it uses all four cores. I built this example to demonstrate the massive time consumption. If I calculate the distance between two points with my own code, it is much faster. The example calculates the distance from each of a thousand points in Y to each of a thousand points in X, one million distances in total.
clear
close all
clc
tic
j=1;
X = rand(1000,2);
Y = rand(1000,2);
fprintf('Time for array creation: ');
toc
tic
for i = 1:1:size(Y,1)
    for k = 1:1:size(X,1)
        A(j,1) = sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
        j = j+1;
    end
end
fprintf('Time for own distance calculation: ');
toc
j = 1;
tic
for i = 1:1:size(Y,1)
    for k = 1:1:size(X,1)
        P = [Y(i,1),Y(i,2);X(k,1),X(k,2)];
        B(j,1) = pdist(P,'euclidean');
        j = j+1;
    end
end
fprintf('Time for distance calculation using Matlab function pdist: ');
toc
Output:
Time for array creation: Elapsed time is 0.000386 seconds.
Time for own distance calculation: Elapsed time is 0.251026 seconds.
Time for distance calculation using Matlab function pdist: Elapsed time is 10.776532 seconds.
As you can see, the loop that calls the MATLAB function pdist takes over 10 seconds longer.
My question is: Why? What else is this function doing?
Would be nice to know.
Thank you very much
Kind regards,
Sebastian
Accepted Answer
Chunru
on 4 Oct 2021
Edited: Chunru
on 4 Oct 2021
%tic
X = rand(1000,2);
Y = rand(1000,2);
% fprintf('Time for array creation: ');
%toc
%% Version 1
% Own distance formula; A grows by one element per iteration (no preallocation)
tic
j=1;
for i = 1:1:size(Y,1)
    for k = 1:1:size(X,1)
        A(j,1) = sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
        j = j+1;
    end
end
size(A)
t = toc;
fprintf('Time for own distance calculation: %.6f\n', t);
%% Version 1.1
% Pre-allocate A
tic
j=1;
A = inf(size(X,1)*size(Y,1), 1);
for i = 1:1:size(Y,1)
    for k = 1:1:size(X,1)
        A(j,1) = sqrt((Y(i,1)-X(k,1))^2 + (Y(i,2)-X(k,2))^2);
        j = j+1;
    end
end
size(A)
t = toc;
fprintf('Time for own distance calculation with preallocation: %.6f\n', t);
%% Version 2
% pdist called once per pair (one million calls); B is not preallocated
tic
j=1;
for i = 1:1:size(Y,1)
    for k = 1:1:size(X,1)
        P = [Y(i,1),Y(i,2);X(k,1),X(k,2)];
        B(j,1) = pdist(P,'euclidean'); % one pair
        j = j+1;
    end
end
size(B)
t = toc;
fprintf('Time for distance calculation using Matlab function pdist: %.6f\n', t);
%% Version 2.1
% Pre-allocate B beforehand
tic
j=1;
B = inf(size(X,1)*size(Y,1), 1);
for i = 1:1:size(Y,1)
    for k = 1:1:size(X,1)
        P = [Y(i,1),Y(i,2);X(k,1),X(k,2)];
        B(j,1) = pdist(P,'euclidean'); % one pair
        j = j+1;
    end
end
size(B)
t = toc;
fprintf('Time for distance calculation using Matlab function pdist with preallocation: %.6f\n', t);
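%% Version 3
% pdist of many points in a single call. This computes the distances between
% all pairs of rows of [X; Y]: x2-x1, x3-x1, ..., x1000-x1, y1-x1, ..., y1000-x1;
% x3-x2, ..., x1000-x2, y1-x2, ..., y1000-x2; etc.
% (the x-x and y-y pairs are included as well)
% doc pdist
tic
p = pdist([X; Y]); % distances for all 2000*1999/2 pairs
size(p)
t = toc;
fprintf('Time for distance calculation using Matlab function pdist (many points): %.6f\n', t);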
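Note that Version 3 returns the distances between all 2000 points, so it is not the same set of values as A and B from the loops. If only the Y-to-X distances are needed, one option is a single call to pdist2, which, like pdist, is part of the Statistics and Machine Learning Toolbox. Another is a plain vectorized formula. The following is only a sketch under some assumptions: the variable names D, dx and dy are just for illustration, the second option needs implicit expansion (R2016b or later), and I have not timed it on the original machine. The slowness of Version 2 most likely comes from the fixed overhead of calling pdist a million times (argument checking and so on) rather than from the distance arithmetic itself, but that is a guess, not something I have profiled.

```matlab
% Sketch: distances between every row of Y and every row of X.
rng(0);                      % fixed seed so runs are repeatable
X = rand(1000,2);
Y = rand(1000,2);

% Option 1: one call to pdist2 (Statistics and Machine Learning Toolbox).
% D(k,i) is the distance between X(k,:) and Y(i,:).
tic
D = pdist2(X, Y);
B = D(:);                    % column-major order matches the i/k loops above
t = toc;
fprintf('Time for pdist2 (one call): %.6f\n', t);

% Option 2: no toolbox, using implicit expansion (R2016b or later).
tic
dx = X(:,1) - Y(:,1).';      % 1000-by-1000 matrix of x differences
dy = X(:,2) - Y(:,2).';      % 1000-by-1000 matrix of y differences
A = sqrt(dx.^2 + dy.^2);
A = A(:);                    % same ordering as A in Version 1
t = toc;
fprintf('Time for vectorized calculation: %.6f\n', t);

% Both results should agree up to rounding error.
fprintf('Largest difference between the two: %.3g\n', max(abs(A - B)));
```

In both options, element (i-1)*1000 + k of the result is the distance between Y(i,:) and X(k,:), which is the same ordering the nested loops produce, so the results can be compared directly against Version 1.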