Asked by Raouf Amara
on 26 May 2019

I have two double arrays A (size: Nx1) and B (Mx1) both with large increasing values (from ~1e9 to ~1e13; not sure this is relevant).

I am trying to compute the pairwise differences between the two vectors but only for pairs that respect a certain condition, that is the absolute value of the difference between the two values is smaller than t.

I first tried to brute force calculate all the pairwise differences but as you would expect a NxM array cannot be stored on a regular computer; it is around a few hundred of terabytes.

So far I have implemented a rather straightforward solution which works fine with small N and M.

Low = 1e9; % lower bound for values in A & B

Upp = 1e13; % upper bound for values in A & B

N = 1e4; % number of elements in A and B (assuming they are equal, which is not the case)

A = sort((Upp-Low)*rand(N,1)+Low,'ascend');

B = sort((Upp-Low)*rand(N,1)+Low,'ascend');

maxdif = 1000000; % maximum difference

PrWD = cell(1,N);

tic

for i = 1:N

% pairwise difference if |B-A(i)|<=maxdif

PrWD{i} = A(i) - B(abs(B-A(i))<=maxdif)';

end

toc

However, I have A and B arrays which are both very large (N and M is the order of a few tens of millions) such that this operation takes a few hours in practice.

I tried arrayfun or bsxfun on small subsets (N=M=1e5) but it does not speed up the process.

At this point, I understand that this is extremely demanding as calculating all the pairwise differences or even the indices verifying the condition seem to scale with N*M.

Is there any (Matlab) way to speed up this process? Maybe taking into acccount the fact that values are increasing, or that not all differences need to be calculated?

Answer by Jan
on 8 Jun 2019

Accepted Answer

Instead of creating the huge index matrix abs(B-A(i))<=maxdif in each iteration, it is cheaper to search the lower and upper limits of the matching elements iteratively:

tic

PrWD2 = cell(1,N);

ini = 1;

fin = ini;

for i = 1:N

% Find initial and final index:

while B(ini) < A(i) - maxdif && ini < N

ini = ini + 1;

end

while B(fin) <= A(i) + maxdif && fin < N

fin = fin + 1;

end

if ini <= fin - 1

PrWD2{i} = A(i) - B(ini:fin - 1).';

end

end

toc

% PrWD2(cellfun('isempty', PrWD2)) = {zeros(1,0)};

% isequal(PrWD, PrWD2)

With your example and N=1e5 this is about 1100 times faster than the original code.

This code replies the empty matrix [], if no element is matching, while the original code creates a [1 x 0] matrix. For a comparison of the results, the empty matrices are adjusted, but this is not needed in the productive code most likely.

Sign in to comment.

Opportunities for recent engineering grads.

Apply Today
## 1 Comment

## Jan (view profile)

Direct link to this comment:https://au.mathworks.com/matlabcentral/answers/464048-how-to-speed-up-pairwise-difference-calculation-between-vectors-under-certain-condition#comment_711777

Sign in to comment.