Using randperm in a loop

20 views (last 30 days)
Joel Schelander
Joel Schelander on 14 Apr 2021
Answered: Joel Schelander on 14 Apr 2021
I have a 3x3 cell called VID1. I want to find all possible combinations and store them in a matrix.
Currently Im using the code below. As I will work with HUGE cells later on I need to optimize it:
So what I want is to loop all the combinations in the best way. The code below can select a combination several times unnecessarily.
for oz=1:999
index=randperm(numel(VID1),2);
if ismember(index,a1)==1
continue
end
a1=[a1; index];
end
As an example:
if index=[1 2]
then that combination should no longer be possible
  4 Comments
Rik
Rik on 14 Apr 2021
My comment showed you why you should not be using a random generator to generate all indices.
A nested loop will work perfectly. I suspect the reason it is too slow for you, is that you want to do many combinations. Many combinations take a lot of time. There isn't really a way out of that. Matlab is not a magic wand.
The only way to reduce the processing time, is if there is an actual vectorized solution. You could implement a convolution with a nested loop, but the conv function uses some trick to do it a lot faster. If there is no such trick for your process, you're stuck.

Sign in to comment.

Accepted Answer

Jan
Jan on 14 Apr 2021
a1=[];
for oz=1:999
index=randperm(numel(VID1),2);
if ismember(index,a1)==1
continue
end
a1=[a1; index];
end
This code collects pairs of random numbers until all elements occur in the result. In opposite to this, in the description you mention all possible combinations. The code does something else. If you are looking for the conmbinations, you need the 'rows' flag for ismember().
ismember replies a logical flag already, so you save time, if you omit the " == 1".
The continue statement bends the program flow. Avoid it, if possible. Here I've inserted the not operator ~ to achieve this:
a1 = [];
for oz = 1:999
index = randperm(numel(VID1), 2);
if ~ismember(index, a1, 'rows')
a1 = [a1; index];
end
end
Now we have a working start point, which is faster already. But a general problem of such random so called "shotgun appraochs" is, that you cannot guarantee, that all solutions are found in a finite amount of time. In addition it wastes a lot of time with checking already existing solutions. With a larger input, this problem grows exponentially. This term should trigger the alarm bells of programmers, because it means, that the run time explodes to millions of years soon.
Your contains a second exponentially growing time eater: The output a1 is growing iteratively. This consumes a surprising huge amount of resources, because with each new element, Matlab has to reserve a new larger array and copy the former values. An example:
x = [];
for k = 1:1e6
x = rand;
end
The output x needs 8 MB only (8 bytes per double). But Matlab reserves ways more memory: 1 element in the first iteration, 2 elements in the 2nd iteration, etc. Finally sum(1:1e6)*8 Bytes are allocated in the RAM, cleared by overwriting with zeros and the former values of almost the same size are copied: 4 TeraByte!
This can be avoided easily by a pre-allocation:
x = zeros(1, 1e6);
for k = 1:1e6
x(k) = rand;
end
Now Matlab reserves 8MB and uses it.
Let's come back to the actual problem: "find all possible combinations and store them in a matrix" and "HUGE" cell.
By the way, avoid terms like "HUGE" in the forum. Some users think, that 200 elements are huge already, because they cannot type their values in manually, others are working with Petabytes of data on a compute cluster and call it "medium sized", because it takes less then a week so process it. Prefer to post absolute numbers instead.
Getting all combinations of large arrays has a severe aspect: The size of the result grows very fast. So before you start, you have to determine the size of the output. For getting 2 indices out of n elements, you get n * (n-1) pairs. Check, if your HUGE * (HUGE-1) * 8 matchs into your RAM at all. If not, you need another solution.
If it does match, think twice: This sounds like a common problem and somebody has solved it already. In you case, there is even a toolbox function:
a1 = nchoosek(1:9, 2)
For HUGE input, this is slow also, because the output is nearly HUGE^2. In the FileExchange you find some faster implementations. Search there for "combination" (if repetiions are wanted as [1,1]) or "permutation" (no repetitions).
You see, that your question opens a wide field and that you can learn a lot from this job.

More Answers (1)

Joel Schelander
Joel Schelander on 14 Apr 2021
Thanks

Products


Release

R2017b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!