Faster alternative to mode?

Simple as that, in my code I am first determining the neighbourhood of a voxel and then determining the most frequent value in the neighbourhood. I am doing this for millions of voxels, so the computation time adds up. I am expecting this to take quite some time, however my entire code has been running for about 2.5 days now with no end in sight and I've determined that the bottleneck is this section of my code. Using profiler it appears that the function "mode" takes up almost 50% of the computation time of the this function, and about 25% of my entire code.
My method of determining the neighbourhood is the next slowest part of the code using: neighbourhood = im(i-1:i+1,j-1:j+1,k-1:k+1);
Does anyone have a good alternative to "mode" i can use in this situation? If anyone has any more efficient suggestions for finding the neighbourhood of the voxels that would also be appreciated.
Thanks!
for i = 1:size(lcoords,1)
%Get neighbourhood of voxel i from list of image voxel subscripts
neighbourhood = im(lcoords(i,1)-1:lcoords(i,1)+1,lcoords(i,2)-1:lcoords(i,2)+1, lcoords(i,3)-1:lcoords(i,3)+1);
neighbourhood(neighbourhood == imcoords(i,2)) = []; %Ignore parts of the neighbourhood that have same value as current voxel
if ~isempty(neighbourhood) %If the voxel is surrounded by any values not including its own region, find the mode of the neighbourhood
imcoords(i,3) = mode(neighbourhood(:)); %record the mode
end
end

5 Comments

Please show your code
The only idea i have is to change these lines
neighbourhood(neighbourhood == imcoords(i,2)) = []; %Ignore parts of the neighbourhood that have same value as current voxel
if ~isempty(neighbourhood) %If the voxel is surrounded by any values not including its own region, find the mode of the neighbourhood
On
tmp = neighbourhood == imcoords(i,2)
if any(~tmp)
I think it would be faster
What about mode? Is there something to simplify there?
Example:
tic
for i = 1:1e6
a = ones(5);
a(3) = [];
end
toc
tic
for i = 1:1e6
a = ones(5);
a(3) = 0;
end
toc
Because ofre-sizing the arrays on each iteration
Thanks, I will change those lines, but the real bottleneck is the mode function. Thats what I would like to simplify as it takes up 50% of the time in my function.

Sign in to comment.

 Accepted Answer

Matt J
Matt J on 7 Apr 2020
Edited: Matt J on 7 Apr 2020
Yes, actually there are no zeros in this matrix. -1s could be changed to zeros as they represent void space that I am uninsterested in. However the point of the code is to identify voxels that touch void space (at least one -1) and determine which value that voxel is touching the most.
Starting another answer for this line of solution. I have a class on the File Exchange that can store 3D arrays in sparse form.
Not only could this save you a lot of memory, but it sounds like at least some of what you're trying to do can be done via sparse 3D convolution with a 3x3x3 kernel of ones. This can be done using the convn method of the class. For example, below I create a 3D binary image A the same dimensions as yours. It contains about 1.7 million non-zeros and consumes 26 MB. Only one of the 1's is not touching any zeros. Using convolution, the code calculates a separate binary matrix B with all elements not touching any zeros removed. Since there is only one such element, B contains only 1 fewer element than A:
>> A=spfun(@ceil, ndSparse.sprand([2854,2906,2013],1e-4) );
>> A(1:3,1:3,1:3)=1;
>>
>> whos A
Name Size Kilobytes Class Attributes
A 2854x2906x2013 26102 ndSparse
>>
>> tic; B=A-(convn(A,ones(3,3,3),'same')>26.999); toc;
Elapsed time is 12.614820 seconds.
>>
>> nnz(A)
ans =
1669474
>> nnz(B)
ans =
1669473

21 Comments

Hi Matt,
This seems to work well however I think I mispoke before. I am not only interested in elements that touch at least one zero, I am actually interested in any element that touches at least one number that is not the same as the element's value. Is there a way to manipulate the convolution to achieve this?
If val is the target value, just apply the technique with A defined as the following binary image,
A=(im==val);
mmh okay but my matrix has thousands of values so wouldn't I have to do the convolution thousands of times? If it takes 12 seconds each time, thats going to be even slower than before.
Matt J
Matt J on 8 Apr 2020
Edited: Matt J on 8 Apr 2020
So, the idea is to find every voxel that differs from at least one of its 26 neighbors? If so, you can adapt my earlier answer,
[di,dj,dk]=ndgrid(-1:+1);
im=ndSparse(im); %Must contain a vast majority of zeros
A=ndSparse.spalloc(size(im),4*nnz(im)); %accumulator
for n=1:27
A=A+abs(im-circshift(im,[di(n),dj(n),dk(n)]));
end
result=A>0;
Okay great, so now I have the following code, but the function mode doesn't seem to work for the class type ndSparse which the variable neighbours ends up being. Is there a way to convert eneighbours back to a regular array? Is that necessary or is there an easier method I'm missing?
A(im==0) = 0; %Dont care about zeros that touch non-zeros
ccoords = find(A(:)); %Get coordinates of 'border' elements
ccoords(:,2) = coords(ccoords(:,1)); %Get values of 'border' elements
[lcoords(:,1),lcoords(:,2),lcoords(:,3)] = ind2sub(size(im),ccoords(:,1)); %Get coordinates of 'border' elements
ccoords(:,3) = 0; %Initialize 3rd row of ccoords
tic
%Find what is mostly surrounding each 'border' element
for i = 1:size(lcoords,1)
neighbourhood = im(lcoords(i,1)-1:lcoords(i,1)+1,lcoords(i,2)-1:lcoords(i,2)+1, lcoords(i,3)-1:lcoords(i,3)+1);
neighbourhood(neighbourhood == ccoords(i,2)) = 0;
ccoords = mode(neighbourhood(:));
end
toc
Matt J
Matt J on 8 Apr 2020
Edited: Matt J on 8 Apr 2020
For the mode calculation, you can also adapt my original answer. This will generate a 3D image of the neighborhood modes:
im=ndSparse(im); %should contain a vast majority of zeros
[di,dj,dk]=ndgrid(-1:+1);
Ishifts=repmat(im,1,1,1,27);
for n=1:27
Ishifts(:,:,:,n)=circshift(im,[di(n),dj(n),dk(n)]);
end
result=mode( sparse2d(Ishifts) , 2);
result=ndSparse(result, size(im));
Thanks again for the quick reply! After result=mode( sparse2d(Ishifts) , 1); i get a 1x27 ndSparse matrix of all zeros and then an error on the next line:
Error using ndSparse (line 126)
The number of DATA elements is not consistent with the given ndShape.
Whoops, that should really be
result=mode( sparse2d(Ishifts) , 2);
Okay great, so that worked for every pixel in the image, now to get the mode of the neighbourhood for only the pixels on the border elements I combined your two answers (see code). However, I am sitll not getting the same results that my original method achieved (which I have validated to be correct). I believe the problem is that the first loop that produces A which should only contain border elements, seems to contain a double layer of elements where the border is between a phase and void space. This results in the majority of these border elements to have a neighbourhood mode of the same phase they are apart of.
Any idea what would cause this "double layer"?
Also the second loop takes nearly a minute to complete on an image of size 502x445x320 that I am using for testing. This is much longer than my previous method. Do you think this will increase significantly with a larger image like my original code?
[di,dj,dk]=ndgrid(-1:+1);
im(im==-1) = 0;
im = ndSparse(im);
A = ndSparse.spalloc(size(im),4*nnz(coords));
tic
for n = 1:27
A = A + abs(im-circshift(im,[di(n),dj(n),dk(n)]));
end
toc
tic
Ishifts=repmat(im,1,1,1,27);
for n=1:27
Ishifts(:,:,:,n)=circshift(im,[di(n),dj(n),dk(n)]);
end
result=mode( sparse2d(Ishifts) , 2);
result=ndSparse(result, size(im));
toc
ccoords(:,1) = find(A);
ccoords(:,2) = im(ccoords(:,1));
ccoords(:,3) = result(ccoords(:,1));
Matt J
Matt J on 9 Apr 2020
Edited: Matt J on 9 Apr 2020
Any idea what would cause this "double layer"?
Did not understand this part at all.
Also the second loop takes nearly a minute to complete on an image of size 502x445x320 that I am using for testing. This is much longer than my previous method. Do you think this will increase significantly with a larger image like my original code?
The speed won't depend on the dimensions of the image. It will depend on the number of non-zeros that it contains. However, you can try the following modification of the second loop to see if it accelerates things.
tic
Ishifts=cell(27,1);
for n=1:27
Ishifts{n}=circshift(im,[di(n),dj(n),dk(n)]);
end
Ishifts=cat(4,Ishifts{:});
result=mode( sparse2d(Ishifts) , 2);
result=ndSparse(result, size(im));
toc
That is a bit faster, thank you! As for the double layer thing, nevermind I figured out the problem. I think I just have one more issue I need to solve. I forgot that the original code calculates the mode but ignores values in the neighbourhood that are equal to the element value. Basically it calculates mode with exception to the current element's value. I achieved this by first eliminating all values in the neighbourhood that equaled the current element's value and then calculating the mode.
Since in this case mode is calculated all in one line for the entire matrix, I am unsure how to achieve this. Any ideas?
Thank you for your continued help with this!
Matt J
Matt J on 10 Apr 2020
Edited: Matt J on 10 Apr 2020
Perhaps as follows:
tic
Ishifts=cell(27,1);
for n=1:27
Ishifts{n}=circshift(im,[di(n),dj(n),dk(n)]);
end
Ishifts=sparse2d( cat(4,Ishifts{:}) );
idx=logical(Ishifts(:,14));
Neighbs=full(Ishifts(idx,:));
Neighbs(Neighbs==Neighbs(:,14))=nan;
result=ndSparse(double(idx),size(im));
result(idx)=mode(Neighbs,2);
toc
Yes this worked well! Thank you for all of your help. I have accepted this answer.
However, on the larger image it has been running for about two days now and still has not finished this step. I am unsure how the timing of this scales with the number of zeros/ones in the image so I am not sure if my original method will be faster than this new one.
Do you still think this will be faster than using the mode function as I was doing previously? And either way do you have any ideas to make this new method faster? Thanks again!
Matt J
Matt J on 12 Apr 2020
Edited: Matt J on 12 Apr 2020
The "larger image" means the 2854x2906x2013 version of im? How many non-zeros does it have?
Yes thats correct. There are 469,194,319 non-zeros in the image. The ratio between the non-zeros and zeros is 0.0289.
Does it successfully convert to ndSparse? How much memory does it consume? I would expect about 2.8 GB.
Also, does the calculation of A complete successfully? How many nonzeros does A have?
A = ndSparse.spalloc(size(im),4*nnz(coords));
tic
for n = 1:27
A = A + abs(im-circshift(im,[di(n),dj(n),dk(n)]));
end
toc
It converts to ndSparse without issue, but it actually took up 30gb of memory which is nearly twice as much as the image.
I believe I made a typo in my code that I sent to you which resulted in a typo in what you just sent. Should the variable, 'im' be the only input in this line?:
A = ndSparse.spalloc(size(im),4*nnz(im))
I ran it as such and double checked it was about 30gb.
Matt J
Matt J on 15 Apr 2020
Edited: Matt J on 15 Apr 2020
If 30GB is the memory consumption of "im" and not "A" then something doesn't add up. If "im" has 469,194,319 non-zero elements, then in type double, it should be consuming this many GB:
>> 469194319*8*3/2^30
ans =
10.4873
I can imagine it being a little bit more due to overhead of sparse matrix storage, but not by a factor of 3.
The line
A = ndSparse.spalloc(size(im),4*nnz(im))
was as I intended, but I didn't know at the time that "im" was 10+ GB. In any case, we still need to know how many non-zeros are in A. Did the computation of A complete? You could probably reduce the pre-allocation to 2*nnz(im).
I'm not sure what to tell you but when I create A its 30 GB, when i reduce the pre-allocation as you said it becomes 15 GB.
On the computer I was running the code on, the allocation of A has not completed because another process is actually taking longer to finish which is stranged because in the smaller image that process does not usually take a signifcant amount of time. That something I will have to look into.
Back to this problem though, I was actually getting an out of memory error when trying those lines but I was accidentally using the non-sparse version of the image. I will let you know when and if it finishes.
A finished and had 205,097,786 non-zeros
Ah well. Sadly, it's not nearly as sparse as I'd hoped and not enough for you to get an advantage out of ndSparse representation. The mode calculation is going to allocate at least 130 GB with that many non-zeros.
I would say your best bet is to do the processing in chunks like I recommended in my first answer. What I envision would be to process im(:,:,1:100), then im(:,:,100:199), then im(199:298) and so on. The overlap is deliberate, so that every 3x3x3 neighborhood is captured in the partitioning.

Sign in to comment.

More Answers (1)

Matt J
Matt J on 6 Apr 2020
Edited: Matt J on 6 Apr 2020
It would be better not to implement the mode calculation repeatedly in a loop over the voxels. If you have enough RAM to hold 27 copies of your image volume, then this is one way to replace the loop with a vectorized calculation:
[di,dj,dk]=ndgrid(-1:+1);
Ishifts=repmat(im,1,1,1,27);
for n=1:27
Ishifts(:,:,:,n)=circshift(im,[di(n),dj(n),dk(n)]);
end
result=mode( Ishifts , 4);

6 Comments

Hi Matt,
Thank you for your answer. My image is 16 GB in 8-bit format and the version of the image I am using for this step must be 32-bit which increases the size even more. The maximum RAM i have access to is 128 so there is no way I could even fit 27 copies of the 8-bit image.
Thank you for trying anyways. If you have any alternative ideas please let me know!
Matt - I'd bet a decent sum of money the problem is mainly pre-allocation, or the lack thereof.
In the code as shown by the op, the variable imcoords is never pre-allocated, yet grown millions of times. That would surely explain the multiple day run time as seen.
Matt J
Matt J on 7 Apr 2020
Edited: Matt J on 7 Apr 2020
The maximum RAM i have access to is 128 so there is no way I could even fit 27 copies of the 8-bit image.
That's less convenient but it is still no reason to go voxel-by-voxel. You can break the image into smaller (overlapping) sub-chunks and apply my proposal to each of the sub-images.
My image is 16 GB in 8-bit format
What are the dimensions of this image?
The version of the image I am using for this step must be 32-bit
That's not possible. If the original data is 8 bit, there is no way the neighborhood modes require more than 8 bits to represent.
Matt - I'd bet a decent sum of money the problem is mainly pre-allocation, or the lack thereof.
John - even if a missing pre-allocation step is remedied, looping element by element through a 16GB image will still make things impractically slow.
That's inconvenient but it is still no reason to go voxel-by-voxel. You can break the image into (overlapping) sub-chunks and apply my proposal to that.
Okay, I haven't had time to try to implement your code in that way. I will try that.
What are the dimensions of this image?
2854x2906x2013
That's not possible. If the original data is 8 bit, there is no way the neighborhood modes require more than 8 bits to represent.
The 32-bit image has replaced the formerly binary values of the image with numbers ranging from -1 to greater than 255.
Matt - I'd bet a decent sum of money the problem is mainly pre-allocation, or the lack thereof.
John - even if a missing pre-allocation step is remedied, looping element by element through a 16GB image will still make things impractically slow.
There is preallocation for every variable shown, I just only included the parts that I felt were relevant to the problem.
The 32-bit image has replaced the formerly binary values of the image with numbers ranging from -1 to greater than 255.
Is there any sparsity that you can take advantage of? Does the image contain mostly zeros?
Yes, actually there are no zeros in this matrix. -1s could be changed to zeros as they represent void space that I am uninsterested in. However the point of the code is to identify voxels that touch void space (at least one -1) and determine which value that voxel is touching the most.
For example, if a voxel of value 1 touches at least one -1, I am interested in it and will then determine whether it is touching mostly other voxels of another positive value or mostly -1s.
I have not worked with sparse matrices very much. What is your idea?

Sign in to comment.

Categories

Asked:

on 6 Apr 2020

Commented:

on 15 Apr 2020

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!