Finding the length of the longest continuous streak of a same number for each row of a matrix

5 views (last 30 days)
Let A be a (possibly 3D) matrix of type uint16 and size [p,q,r] = size(A), whose rows are all sorted in an ascending way. I want to find for each row A(i,:,k) the length of the longest continuous streak of same number in it. For example, if
A = [1 2 3 4 4 4;
1 1 2 2 2 2];
is such a matrix, then my output should read
c = [3;4]
While q is small, i.e. , the number of rows p can be quite big, e.g. , and r should be reasonable (to be determined). The latter means that, if possible, I would like to simultaneously perform the routine in question on several 2D matrices of the same size stacked as a 3D matrix rather than loop through them one by one, provided the procedure allows for that and there is a performance benefit to it.
Here are 3 methods I've given some thought.
If a is just a row, then
% Method 1: works only for vectors, don't know how to vectorize it to A
b = max(diff(find([1,diff(a)~=0,1])));
does it, but I don't know how to generalize this to a 2D or 3D matrix A. The main issue is that find() only gives the linear indices, so one has to somehow compute max(diff()) , so to speak. So, instead, for a 3D matrix A I have the following two routines:
% Input:
p = 5000;
q = 10;
r = 10;
A = sort(randi(1000,p,q,r,'uint16'),2);
% Method 2: faster when p is relatively small
A_unique = A; % create a copy of A
true_col = true(p,1,r);
A_unique_positions_logical = [true_col,diff(A,1,2) > 0]; % logical index of unique 'representatives' of the values of each row
A_unique(~A_unique_positions_logical) = 65535; % the values of A are not too big and can be bounded from above safely (see below)
A_unique = permute(A_unique,[1 4 3 2]);
C = max(permute(sum(uint8(A == A_unique),2,'native'),[1 4 3 2]),[],2); % compares each row with its respective row of unique values
% counts the occurances of each unique value in this row and then takes max(), 65535 has no matches => only unique values are counted
But this routine is not very suitable when p is as big as in my use case. So, the routine I am currently using is via computing an incidence matrix:
% Input:
p = 500000;
q = 10;
r = 10;
A = sort(randi(1000,p,q,r,'uint16'),2);
% Method 3: faster when p is bigger
I_mat = VChooseK(uint8(1:q),2);
m = size(I,1); % i.e. m = nchoosek(q,2), which is small, since q <= 10
Incidence_mat = ones(p,q,r,q,'uint8');
for k=1:m % this is a relatively small loop, comparing all the 2-combinations of the columns of A
i = I_mat(k,1); j = I_mat(k,2);
A1_col = A(:,i,:);
A2_col = A(:,j,:);
Eq_tmp = uint8(A1_col == A2_col);
Incidence_mat(:,i,:,j) = Eq_tmp;
Incidence_mat(:,j,:,i) = Eq_tmp;
end
C = max(sum(Incidence_mat_gpu,4,'native'),[],2);
I am hoping that Method 1 can be adapted to 2-3D matrices in a way that is faster than the other two routines. Any help is much appreciated!
Lastly, let me mention that the vectorization routines are intended to be run on a GPU.

Accepted Answer

Bruno Luong
Bruno Luong on 19 Feb 2023
Edited: Bruno Luong on 19 Feb 2023
p = 3;
q = 10;
r = 2;
A = sort(randi(10,p,q,r,'uint16'),2)
A = 3×10×2 uint16 array
A(:,:,1) = 1 1 3 6 6 6 6 7 7 10 2 4 6 6 7 7 7 8 9 10 1 2 3 4 4 5 5 6 6 8 A(:,:,2) = 1 1 4 4 5 6 6 7 8 10 1 1 1 3 3 3 4 4 6 10 1 1 1 2 3 4 4 4 9 10
B = reshape(permute(A,[2 1 3]),q,[]);
t = true(1,width(B));
[i,j] = find([t;diff(B,1,1);t]);
ll = accumarray(j,i,[],@(i)max(diff(i)));
ll = reshape(ll,p,r);
disp(ll)
4 2 3 3 2 3
  4 Comments
Bruno Luong
Bruno Luong on 19 Feb 2023
Edited: Bruno Luong on 19 Feb 2023
This modification should work for gpu (and cpu)
p = 3;
q = 10;
r = 2;
A = sort(randi(10,p,q,r,'uint16'),2)
A = gpuArray(A);
B = reshape(permute(A,[2 1 3]),q,[]);
t = true(1,width(B));
[i,j] = find([t;diff(B,1,1);t]);
di = [-q; diff(i)];
ll = accumarray(j,di,[],@max);
ll = reshape(ll,p,r);
disp(ll)

Sign in to comment.

More Answers (2)

John D'Errico
John D'Errico on 19 Feb 2023
When you have a question where there is some specific issue that you should know is a problem, TELL US ABOUT IT. Don't keep it a secret, only saying something when you get an answer. And needing to use a GPU, or the parallel computing TB in general is such an issue. Most people don't have it.
You have an array, and you want to find the longest constant run of ANY value in each row. For example...
A = randi(3,10,15)
A = 10×15
3 1 3 2 2 3 3 3 1 3 2 1 3 3 3 1 3 3 2 3 2 1 3 1 1 2 3 3 3 2 3 2 1 1 3 1 3 2 3 3 1 1 1 2 3 3 1 3 1 3 1 3 3 1 3 3 1 3 1 2 3 1 2 1 3 2 2 2 2 1 1 3 2 2 3 3 3 3 3 3 3 2 2 3 2 1 2 2 2 3 3 1 3 1 1 2 3 3 3 2 2 2 1 3 3 2 1 1 3 3 1 3 1 3 1 3 3 1 2 1 2 1 3 2 2 3 3 3 1 2 2 2 3 1 2 1 1 3 3 2 1 2 1 3 2 2 2 2 2 1
Start by identifying ALL such runs. The simplest way to do this is via diff.
n = size(A,1);
Acon = [zeros(n,1),~diff(A,[],2)]
Acon = 10×15
0 0 0 0 1 0 1 1 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 0 0 1 0 0 1 1 1 1 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 1 1 0 1 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 1 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 1 1 0
Any 1 in a row of Acon indicates a location where we found a sequence of constant elements in that row. Runs of ones indicate longer runs of constant elements. So now the problem reduces to finding the length of the longest run of ones in each row. Again though, here we have a problem of knowing what codes will work for you, using a GPU.
So, next, try this:
conv2(Acon,[1 2 1])
ans = 10×17
0 0 0 0 1 2 2 3 3 1 0 0 0 1 3 3 1 0 0 1 2 1 0 0 0 0 1 2 1 1 3 3 1 0 0 0 0 1 2 1 0 0 0 1 2 2 3 3 1 0 0 0 0 0 0 0 0 0 1 2 1 1 2 1 0 0 0 0 0 0 0 0 0 0 1 3 4 3 2 2 1 1 2 1 0 0 1 3 4 4 4 3 2 2 1 0 0 1 3 3 1 0 0 0 0 0 1 2 1 1 3 3 2 3 3 1 1 2 1 0 0 1 2 2 2 1 0 0 0 0 1 2 1 0 0 0 0 0 0 0 1 2 2 3 3 1 1 3 3 1 0 0 0 0 1 2 2 2 1 0 0 0 0 1 3 4 4 3 1 0
What you will see is
max(conv2(Acon,[1 2 1]),[],2)
ans = 10×1
3 3 3 2 4 4 3 2 3 4
tells you the length of the longest sequence of constant elements in a row, as long as that sequence is not longer than 4. And for those rows where there are more then 4 elements in a run, it will report a run of 4's.
So if you expect that there would rarely be runs of length 5 or more, then you could just use the single line code above, then check each row where a 4 was reported, to see if there was actually a run of length 5 or more. And that is easy enough to check. But the nice thing is, the above code will be extremely fast.
  3 Comments
John D'Errico
John D'Errico on 19 Feb 2023
Its hard to say. I have a funny feeling that with a creative use of conv, I could identify longer runs too. But I think that requires me to know the maximum run length.
Another idea is to use a run length encoding on the rows of your matrix, expecially once it has been converted to a binary form as I did with Acon. So if each row were run length encoded, then the problem reduces to finding the longest run of ones. But I'm not sure that would be directly GPU-able.
Or, you could use various image processing ideas. Again, not sure how well they would work for you. Regardless, I think Bruno has come up with a variation that will be GPU-able.
Marin Genov
Marin Genov on 19 Feb 2023
Edited: Marin Genov on 19 Feb 2023
Your idea about using convolution is very interesting, especially since it's orders of magnitude faster than all of the other alternatives. So I will try to think about it some more and see if I can get anywhere for the general case. At any rate, the maximum run length has a trivial, but attainable upper bound, namely the width of the matrix, and this cannot be changed without padding the matrix with further zero columns.

Sign in to comment.


Marin Genov
Marin Genov on 2 Mar 2023
I had a bit of fever and remembered the following little trick to vectorize find row-wise that should have been obvious from the start :-) Here is the full function:
% Input:
p = 500000;
q = 10;
r = 25;
A = sort(randi(1000,p,q,r,'uint16'),2);
% New Method:
tic;
q_vec = uint16(1:q+1);
Ones_col = ones(p,1,r,'uint16');
A_diff = uint16(logical(diff(A,1,2))); % length of run of 0-s = difference of the positions of the 1-s bounding them
A_diff_padded = [Ones_col,A_diff,Ones_col]; clear A_diff; % making sure there is always a boundary of ones
A_pos = A_diff_padded.*q_vec; clear A_diff_padded q_vec; % 1-s are replaced by their positions
A_pos_cumax = cummax(A_pos,2); clear A_pos; % the positions of boundary 1-s become neighbours
A_pos_cumax_diff = diff(A_pos_cumax,1,2); clear A_pos_cumax; % difference of the positions of 1-s
Count_col = max(A_pos_cumax_diff,[],2); clear A_pos_cumax_diff;
toc;
This is much faster than all of the above general methods, and in terms of performance it seems on par with the convolution method, except it's not limited to runs of 4 or less elements. Basically, find is replaced by a multiplication with 1:w, where w is the width of the 3D matrix, and a cummax. I also added a few clear because I ran out of memory before I had run out of performance.

Products


Release

R2022b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!