random select n elements from n arrays
7 views (last 30 days)
I need random select 50 elements from 50 lists to build a new array, each list includes 100 elements, then apply some post sorting processing on all possible new arrays to find the indexes of all 50 elements for the best array.
The question is there will be possibility for the combination of the new array. How to realize this work in MATLAB? Can we try Mapreduce?
this problem feels like I have 50 classes and each class have 100 students, and i need find the 'best' team with 50 students from 50 classes. each student has a unique score/contribution for the team, so i scan all possible teams and sort them to find the one i want as well as the student ID in each class.
I believe what they may be after is the (what should be more famous) "Optimal Stopping Strategy", or the "37% rule". They might have discussed it in your course. Basically it says that it says that you review the first 37% of the possibilities without selecting any of them. Then you start reviewing the possibilities with the rule that the first one you encounter that is better than any of them you have seen before is the one you should pick. If you do that then there is a 37% chance you have picked the optimal possibility.
Pretty weird, right? However there is actually some theory behind it.
Here is a web page with more information including a table and more complicated twists on the problem.
However even 37% of 100^50 is too big, but if your course did not have some particular strategy in mind, Wikpedia has some you can try: https://en.wikipedia.org/wiki/Optimal_stopping
More Answers (4)
How do you figure 100^50? You have 50 lists of 100 elements from which you are supposed to select half the elements randomly. So you will have 50 output vectors. You can use randperm()
For one vector, let's call it vec1, you can do
randomIndexes = randperm(length(vec1), 50); % Get 50 random indexes.
% Now extract those into vec1Output:
vec1Output = vec1(randomIndexes);
% Apply "some sorting processing":
vec1Output = sort(vec1Output, 'ascend'); % Or whatever you want to do.
There, just do that also for vec2 through vec50 and you'll have 50 output vectors.
Peter O on 4 Aug 2021
With a little programming and a for loop you can handle all possible combinations.
If you want to be able to draw repeated elements from the lists (e.g. the new list could include index 10 multiple times), use randi. Otherwise, use randperm.
% Generate some fake lists as a single array, each having 100 entries.
n_lists = 50;
n_entries = 100;
Lists = rand(100,n_lists);
n_pick = 50;
% We know the size of the draw. Preallocate.
NewLists = zeros(n_pick, n_lists);
% Generate the indices to draw at random.
Indexes = randperm(size(Lists,1), n_pick);
% or if you want to allow repeats:
% Indexes = randi(size(Lists,1),n_pick,1);
NewLists(1:n_pick, ix) = Lists(Indexes, ix);
% Sort as-needed here. Since the lists here are ordered as a single matrix
% you can handle it with a single call. sort called without addiitonal
% arguments will sort individual columns
NewLists_sorted = sort(NewLists)
Walter Roberson on 4 Aug 2021
Edited: Walter Roberson on 4 Aug 2021
Lists = arrayfun(@(idx) randi(99,1,100), 1:50, 'uniform', 0);
one_random_selection = cellfun(@(L) L(randi(100)), Lists)
If you want to find all of the unique sort()'s, that would be a different matter entirely.