Very slow loop trying to find any intersection
Hi Everyone,
I am trying to figure out whether there is any intersection between a pair of observations in terms of the partners they have worked with. jaccard_dyadic is the dyadic table in which the first two columns identify the observations (i.e. the pair that makes up the unique identifier). I am trying to fill column m with the value 1 whenever both observations have worked with any of the same inventors (assignee_inventors is a matrix in which the observations are the rows and the inventors the columns, filled with a 1 whenever the observation of the corresponding row has worked with the inventor of the corresponding column). The complicated loop structure I have created below does exactly that; however, it is super slow. Any help on how to speed up this process would be much appreciated (I suspect that there is a much simpler way of doing this).
for i = 1:(find(jaccard_dyadic(:,1)==0, 1, 'first')-1)
    for l=1:p(2)
        if any(assignee_inventors(jaccard_dyadic(i,1),l)==assignee_inventors(jaccard_dyadic(i,2),l) && assignee_inventors(jaccard_dyadic(i,2),l)==1)
            jaccard_dyadic(i,m)=1;
        end
    end
end
EDIT:
This is the whole code I am using. I have added some sample data. Given that the results are quite sparse, I hope that there are some instances of what I am looking for here. I haven't uploaded the desired output, but essentially it is just the last column of the jaccard_dyadic matrix (filled with zeros) that I want to take on the value 1 if there is any overlap as described above.
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%Any Same Inventors
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
load('jaccard_dyadic_test.mat')
B = readmatrix('Inventor_copy.csv');
assignees = B(:,1);
inventors = B(:,2);
assignee_inventors=zeros(max(unique(B(:,1))), max(unique(B(:,2))));
empty_dim = size(B);
%%
for i=1:empty_dim(1)
    assignee_inventors(assignees(i),inventors(i))=1;
end
%%
% actual code for what I need
p = size(assignee_inventors);
m = find(all(jaccard_dyadic==0), 1, 'first');
for i = 1:(find(jaccard_dyadic(:,1)==0, 1, 'first')-1)
    for l=1:p(2)
        if any(assignee_inventors(jaccard_dyadic(i,1),l)==assignee_inventors(jaccard_dyadic(i,2),l) && assignee_inventors(jaccard_dyadic(i,2),l)==1)
            jaccard_dyadic(i,m)=1;
        end
    end
end
fprintf('After Inventors ');toc
Answers (1)
A simplified version to get the overview:
dy = jaccard_dyadic;
in = assignee_inventors;
n = find(dy(:,1) == 0, 1, 'first') - 1;
for i = 1:n
    for k = 1:p(2) % k is less confusing than l
        if any(in(dy(i,1), k) == in(dy(i,2), k) && in(dy(i,2), k) == 1)
            jaccard_dyadic(i, m) = 1;
        end
    end
end
What is m? What is the purpose of any()? For a scalar input you can omit any() and write:
if in(dy(i,1), k) == in(dy(i,2), k) && in(dy(i,2), k) == 1
Isn't this the same as:
if in(dy(i,1), k) == 1 && in(dy(i,2), k) == 1
Which values can in contain? If it is only 0 or 1:
if in(dy(i,1), k) && in(dy(i,2), k)
Then your loop might be equivalent to:
jaccard_dyadic = assignee_inventors(dy(:, 1), 1:p(2)) & ...
assignee_inventors(dy(:, 2), 1:p(2));
Here I guess that "m" is the inner loop counter. Maybe you need to add "==1" to both operands. Replace "1:p(2)" by a simple ":" if this matches your needs.
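As a quick sanity check of the simplification of the if condition, the following sketch enumerates the four combinations of 0 and 1 and tests whether the three forms agree. It assumes the matrix contains only 0 or 1; the names vals, a, b, c1, c2, c3 and agree are made up for this illustration.

```matlab
% Assumption: the entries are only 0 or 1.
vals  = [0 1];
agree = true;
for a = vals
    for b = vals
        c1 = (a == b) && (b == 1);   % condition from the original loop
        c2 = (a == 1) && (b == 1);   % explicit comparison of both operands
        c3 = a && b;                 % short form, valid for 0/1 values only
        agree = agree && (c1 == c2) && (c2 == c3);
    end
end
disp(agree)   % true if all three forms agree for every combination
```

If the matrix could hold other values (e.g. counts), the short form a && b would no longer match the original condition, so the "==1" comparison should then be kept.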
3 Comments
John Kirk
on 5 Jun 2019
John Kirk
on 6 Jun 2019
After you have explained that m is a constant, the inner loop can be omitted:
in = assignee_inventors; % Shorter names for nicer code
dy = jaccard_dyadic;
p = size(in);
m = find(all(dy==0), 1, 'first');
for i = 1:500
    jaccard_dyadic(i, m) = any(in(dy(i,1), :) & in(dy(i,2), :), 2);
end
The outer loop can be vectorized as well:
jaccard_dyadic(:, m) = any(in(dy(:, 1), :) & in(dy(:, 2), :), 2);
I'd prefer to test the code before posting. Therefore it is better to post some input data, e.g. created by rand.
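A minimal sketch of such a test with rand could look like this. All sizes and names (nObs, nInv, nPairs, viaLoop, viaVec) are hypothetical and not taken from the real data files; the sketch only compares a row-wise loop with the vectorized any(..., 2) expression on small random input.

```matlab
% Hypothetical test data, not the poster's real files.
rng(0);                                   % reproducible random numbers
nObs = 20; nInv = 30; nPairs = 50;        % made-up sizes
in = rand(nObs, nInv) < 0.1;              % sparse 0/1 membership matrix
dy = [randi(nObs, nPairs, 2), zeros(nPairs, 1)];  % pairs plus an empty result column
m  = find(all(dy == 0), 1, 'first');      % first all-zero column

% Reference result: one row at a time
viaLoop = dy;
for i = 1:nPairs
    viaLoop(i, m) = any(in(dy(i, 1), :) & in(dy(i, 2), :));
end

% Vectorized result: all rows at once
viaVec = dy;
viaVec(:, m) = any(in(dy(:, 1), :) & in(dy(:, 2), :), 2);

disp(isequal(viaLoop, viaVec))            % compare both results
```

With input like this, anybody can run the suggestion before posting it, and the comparison shows directly whether the vectorized line reproduces the loop.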
"I get an error that the index in position 1 is invalid."
Please post a copy of the complete error message, not a rephrased version. Which index is meant? Which code did you try exactly? Post it, because it might contain a typo. Maybe your jaccard_dyadic has more elements than assignee_inventors and some elements are zero. You can check this easily.
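One possible way to run this check is sketched below on a tiny made-up example, in which the last row of dy is unused padding filled with zeros. The names idx, hasZero, outOfRange and valid are hypothetical, and restricting the computation to the valid rows is only one option for handling such rows.

```matlab
% Hypothetical example data: the last row of dy is zero padding.
in = logical([1 0 1; 0 1 1]);             % 2 observations, 3 inventors
dy = [1 2 0; 2 1 0; 0 0 0];               % pairs in columns 1:2, result in column 3
m  = 3;                                   % result column, fixed for this example

idx = dy(:, 1:2);
hasZero    = any(idx(:) == 0);            % zero is not a valid MATLAB index
outOfRange = any(idx(:) > size(in, 1));   % index beyond the rows of "in"
fprintf('zero index: %d, out of range: %d\n', hasZero, outOfRange);

% Restrict the computation to rows whose indices are valid
valid = all(idx >= 1 & idx <= size(in, 1), 2);
dy(valid, m) = any(in(dy(valid, 1), :) & in(dy(valid, 2), :), 2);
disp(dy)
```

If hasZero or outOfRange is true, indexing with the full columns dy(:, 1) and dy(:, 2) fails with an invalid index error, while the masked version skips the affected rows.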