Clear Filters
Clear Filters

All combinations of a (minimum number of) elements in a vector that sum greater or equal to a given number

6 views (last 30 days)
I am trying to find combinations of elements of a vector whose sum is greater or equal to a given value. For example:
vec = [5 4 3 2 1];
criteria = 8.5;
Then the output would look like [5 4] [5 3 2] [5 3 1] [4 3 2]
I have tried several "nchoosek" approaches including 1 on the FEX, namely nchoosecrit. But there are two issues with the "list all possible combinations" approach:
1) I do not need all possible combinations. Note for example that the answer [5 4 3] also satisfies the criteria, however the last "3" is superfluous.
2) More importantly, the actual vector length contains 130 elements, so the "nchoose..." approach will yield 2^130 - 1 candidate solutions, which is clearly infeasible.
Any help would be much appreciated.
  8 Comments
Rik
Rik on 23 Nov 2017
@Jan, thanks for the correction. I didn't remember what it was exactly, I thought it would be 100. I only came across the limit when I tried to implement the Ackermann function.
Hamad Alsayed
Hamad Alsayed on 24 Nov 2017
@Jan, I'm afraid not. But I am looking at different ways to solve this, e.g. breaking it into smaller problems. Thanks!

Sign in to comment.

Answers (1)

John D'Errico
John D'Errico on 24 Nov 2017
As it is, I'm afraid this is one of those combinatorially nasty problems that cannot be solved effectively using brute force, at least on current computing systems. I see that you are looking for only minimal sets, in that you said that [5 4 3] would not be considered with a lower limit of 8.5, since the 3 was superfluous in your example.
But lets consider an example problem, consistent with what you have said.
S = sort(randperm(250,130))
S =
Columns 1 through 28
2 4 5 6 8 12 14 15 16 18 20 22 23 25 29 30 31 32 34 35 36 37 38 39 42 45 47 49
Columns 29 through 56
50 51 53 54 61 62 64 65 66 67 75 79 80 82 84 87 88 89 92 94 98 99 100 102 103 104 107 108
Columns 57 through 84
109 110 115 116 117 119 121 122 123 124 128 132 134 135 141 142 143 144 145 146 147 149 152 154 156 159 161 162
Columns 85 through 112
163 165 167 168 170 171 172 173 174 175 176 177 178 179 181 183 185 188 189 194 199 200 202 203 204 206 207 208
Columns 113 through 130
211 215 216 217 218 219 220 221 225 226 229 231 232 237 239 240 244 249
You don't tell us what the lower limit would be. But I'll pick a number, say 2500. Can you find all such minimal sets that sum to say, just over 2500? First of all, how many elements would appear in such a candidate set? Since the average element of S is roughly 125 and the maximum 250, we would need at least 10 elements in any such set, and on average roughly 20.
nchoosek(130,20)
Warning: Result may not be exact. Coefficient is greater than 9.007199e+15 and is only accurate to 15 digits
> In nchoosek (line 92)
ans =
1.6736e+23
So if we were to look for subsets that sum to just over 2500 from S, there will be roughly 1e23 such sets to generate. Sets of such size are simply too numerous to work with. Could you even store them? It would require roughly
1e23*20*8
ans =
1.6e+25
1.6e25 bytes of memory. 1.6e13 terabytes of memory.
You never gave a realistic example, so I had to make one up. But as has been said multiple times in the comments, this problem is simply too large to handle by computing all such subsets.
You need to consider if there are alternative approaches that would be acceptable. Again, we have never been given a realistic example. Nor have we ever been told what you are trying to do. It is often the case that someone asks a problem, but without explaining why they are trying to solve their problem as they wish. We may only be able to tell them that they simply cannot solve it as they wish, without being able to say there MAY BE another way to completely reformulate the problem that is tractable.
Realistically however, you need to look for a way to reformulate this. You may not be able to solve it until there are computers readily available that can utilize trillions of terabytes of memory, and that are fast enough to do all the computations needed. And if that lower limit were somewhat higher in the example that I created, then a computer the size of the entire universe might be insufficient.
nchoosek(130,65)
Warning: Result may not be exact. Coefficient is greater than 9.007199e+15 and is only accurate to 15 digits
> In nchoosek (line 92)
ans =
9.5068e+37
ans*65*8
ans =
4.9435e+40
  1 Comment
Hamad Alsayed
Hamad Alsayed on 4 Dec 2017
Hi John, thank you very much for the detailed explanation. The intractability of the problem is very clear.... simple to state, difficult to solve. Thanks again!

Sign in to comment.

Categories

Find more on Loops and Conditional Statements in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!