How do I get largest distance between two rows in an nx3 matrix?

11 views (last 30 days)
The rows in the matrix are x, y and z coordinates, and I want to find the maximum possible distance between two rows in the matrix. I was able to find the largest distances between columns in each row individually, but not the row with the largest sum and the row with the smallest sum.

Answers (4)

Walter Roberson
Walter Roberson on 5 Oct 2023
pdist -- possibly with a non-default metric.

Torsten
Torsten on 5 Oct 2023
Edited: Torsten on 5 Oct 2023
max(pdist(X),[],"all")
where X is your matrix.

John D'Errico
John D'Errico on 5 Oct 2023
As long as there are not too many points, pdist2 will work, and be easy to use. Just choose the largest of all distances found.
help pdist2
PDIST2 Pairwise distance between two sets of observations. D = PDIST2(X,Y) returns a matrix D containing the Euclidean distances between each pair of observations in the MX-by-N data matrix X and MY-by-N data matrix Y. Rows of X and Y correspond to observations, and columns correspond to variables. D is an MX-by-MY matrix, with the (I,J) entry equal to distance between observation I in X and observation J in Y. D = PDIST2(X,Y,DISTANCE) computes D using DISTANCE. Choices are: 'euclidean' - Euclidean distance (default) 'squaredeuclidean' - Squared Euclidean distance 'seuclidean' - Standardized Euclidean distance. Each coordinate difference between rows in X and Y is scaled by dividing by the corresponding element of the standard deviation computed from X, S=NANSTD(X). To specify another value for S, use D = PDIST2(X,Y,'seuclidean',S). 'fasteuclidean' - Euclidean distance computed by using an alternative algorithm that saves time. This faster algorithm can, in some cases, reduce accuracy. 'fastsquaredeuclidean' - Squared Euclidean distance computed by using an alternative algorithm that saves time. This faster algorithm can, in some cases, reduce accuracy. 'fastseuclidean' - Standardized Euclidean distance computed by using an alternative algorithm that saves time. This faster algorithm can, in some cases, reduce accuracy. 'cityblock' - City Block distance 'minkowski' - Minkowski distance. The default exponent is 2. To specify a different exponent, use D = PDIST2(X,Y,'minkowski',P), where the exponent P is a scalar positive value. 'chebychev' - Chebychev distance (maximum coordinate difference) 'mahalanobis' - Mahalanobis distance, using the sample covariance of X as computed by NANCOV. To compute the distance with a different covariance, use D = PDIST2(X,Y,'mahalanobis',C), where the matrix C is symmetric and positive definite. 'cosine' - One minus the cosine of the included angle between observations (treated as vectors) 'correlation' - One minus the sample linear correlation between observations (treated as sequences of values). 'spearman' - One minus the sample Spearman's rank correlation between observations (treated as sequences of values) 'hamming' - Hamming distance, percentage of coordinates that differ 'jaccard' - One minus the Jaccard coefficient, the percentage of nonzero coordinates that differ function - A distance function specified using @, for example @DISTFUN A distance function must be of the form function D2 = DISTFUN(ZI,ZJ), taking as arguments a 1-by-N vector ZI containing a single observation from X or Y, an M2-by-N matrix ZJ containing multiple observations from X or Y, and returning an M2-by-1 vector of distances D2, whose Jth element is the distance between the observations ZI and ZJ(J,:). For built-in distance metrics, the distance between observation I in X and observation J in Y will be NaN if observation I in X or observation J in Y contains NaNs. D = PDIST2(X,Y,DISTANCE,'Smallest',K) returns a K-by-MY matrix D containing the K smallest pairwise distances to observations in X for each observation in Y. PDIST2 sorts the distances in each column of D in ascending order. D = PDIST2(X,Y,DISTANCE, 'Largest',K) returns the K largest pairwise distances sorted in descending order. If K is greater than MX, PDIST2 returns an MX-by-MY distance matrix. For each observation in Y, PDIST2 finds the K smallest or largest distances by computing and comparing the distance values to all the observations in X. [D,I] = PDIST2(X,Y,DISTANCE,'Smallest',K) returns a K-by-MY matrix I containing indices of the observations in X corresponding to the K smallest pairwise distances in D. [D,I] = PDIST2(X,Y,DISTANCE, 'Largest',K) returns indices corresponding to the K largest pairwise distances. D = PDIST2(X,Y,DISTANCE,'CacheSize',CACHESIZE) uses an intermediate matrix stored in cache to compute D, when 'Distance' is one of {'fasteuclidean','fastsquaredeuclidean','fastseuclidean'}. 'CacheSize' can be a positive scalar or 'maximal'. The default is 1e3. If numeric, 'CacheSize' specifies the cache size in megabytes (MB) to allocate for an intermediate matrix. If 'maximal', pdist2 attempts to allocate enough memory for an entire intermediate matrix whose size is MX-by-MY (MX is the number of rows of the input data X, and MY is the number of rows of the input data Y). 'CacheSize' does not have to be large enough for an entire intermediate matrix, but it must be at least large enough to hold an MX-by-1 vector. Otherwise, the regular algorithm of computing Euclidean distance will be used instead. If the specified cache size exceeds the available memory, MATLAB issues an out-of-memory error. Example: % Compute the ordinary Euclidean distance X = randn(100, 5); Y = randn(25, 5); D = pdist2(X,Y,'euclidean'); % euclidean distance % Compute the Euclidean distance with each coordinate difference % scaled by the standard deviation Dstd = pdist2(X,Y,'seuclidean'); % Use a function handle to compute a distance that weights each % coordinate contribution differently. Wgts = [.1 .3 .3 .2 .1]; % coordinate weights weuc = @(XI,XJ,W)(sqrt((XI - XJ).^2 * W')); Dwgt = pdist2(X,Y, @(Xi,Xj) weuc(Xi,Xj,Wgts)); See also PDIST, KNNSEARCH, CREATENS, KDTreeSearcher, ExhaustiveSearcher. Documentation for pdist2 doc pdist2 Other uses of pdist2 gpuArray/pdist2 tall/pdist2

Bruno Luong
Bruno Luong on 5 Oct 2023
Edited: Bruno Luong on 5 Oct 2023
For euclidian distance, the largest distance must contain 2 points on the convex hull
xy = randn(10000,2);
K = convhull(xy);
n = size(K,1);
dmax = -Inf;
for i=2:n
j = 1:i-1;
dxy = xy(K(1:i-1),:)-xy(K(i),:);
dxy2 = sum(dxy.^2,2);
[dmaxi, k] = max(dxy2);
if dmaxi > dmax
dmax = dmaxi;
imax = i;
jmax = k;
end
end
imax = K(imax);
jmax = K(jmax);
figure
hold on
plot(xy(:,1), xy(:,2), '.');
plot(xy([imax jmax],1), xy([imax jmax],2), '-or');
axis equal
I wonder if such characterization also apply for p-norm distance. If someone can come up with a proof or counter example I'll glad to learn.

Products


Release

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!