Select random data from a matrix and replace it
stelios loizidis
on 14 May 2019
Commented: Jos (10584)
on 16 May 2019
Hello,
I have the following problem. I have a matrix, e.g.:
1 0 1 0 1
1 1 1 0 0
1 1 1 1 1
1 1 0 0 1
and I am trying to do the following: when the sum of a column exceeds the threshold = 4, some random "1" of the column that goes beyond the threshold should become zero. How do I implement this in MATLAB? Can this be done without using loops? I would be very grateful if someone helped me.
Accepted Answer
Andrei Bobrov
on 14 May 2019
Edited: Andrei Bobrov
on 15 May 2019
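A = [1 0 1 0 1
     1 1 1 0 0
     0 0 0 0 0
     1 1 1 1 1
     1 1 0 0 1
     0 0 0 0 0];
p = 4;                 % maximum number of ones to keep
[ii,jj] = find(A);     % row and column subscripts of the nonzero entries
% Group the column subscripts by row and keep a random subset of at most p.
% Note: grouping by ii caps the ones per row; to cap them per column,
% run the code on A.' and transpose the result.
jjj = accumarray(ii,jj,[size(A,1),1],@(x){x(randperm(numel(x),min(numel(x),p)))});
k = cellfun(@numel,jjj);   % number of ones kept in each row
% Rebuild the 0/1 matrix from the kept (row, column) pairs.
out = accumarray([repelem((1:numel(k))',k),cell2mat(jjj)],1,size(A));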
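Because the code above groups by row subscript, a column-wise variant may be closer to what the question asks. The following is a minimal sketch of one loop-free way to do that, not the answerer's method: it ranks the ones of each column by random keys and keeps the first p. The names keys, order and rnk are illustrative, and p = 2 is an assumption chosen only so that some columns of the question's matrix actually exceed the threshold.

```matlab
% Sketch: keep at most p ones per column, dropping the surplus at random.
A = [1 0 1 0 1
     1 1 1 0 0
     1 1 1 1 1
     1 1 0 0 1];                        % example matrix from the question
p = 2;                                  % assumed threshold for illustration

keys = rand(size(A));                   % one random key per element
keys(A == 0) = -Inf;                    % zeros always rank after the ones
[~, order] = sort(keys, 1, 'descend');  % per column: rows ordered by key
[~, rnk] = sort(order, 1);              % invert: rank of each element in its column
out = double(A ~= 0 & rnk <= p);        % keep only the p best-ranked ones
```

Columns that already hold p or fewer ones pass through unchanged, since all of their ones have a rank of at most p.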
More Answers (2)
Jos (10584)
on 15 May 2019
Edited: Jos (10584)
on 15 May 2019
Here is another, indexing, approach:
A = randi(2, 6, 8)-1 % random 0/1 array
M = 3 % max number of 1's per column
szA = size(A) ;
B = zeros(szA) ;
tf = A == 0 ;
[~, P] = sort(rand(szA)) ; % randperm for matrix along columns
P(tf) = 0 ;
[~, r] = maxk(P, M) ; % rows with M highest values in P (including 0's) per column
i = r + szA(1)*(0:szA(2)-1) ; % convert to linear indices
B(i) = 1 ; % B contains M 1's per column
B(tf) = 0 % reset those that were 0 in A -> only maximally M 1's per column remain
2 Comments
Jos (10584)
on 16 May 2019
I don't get it. Why not simply run the code only once for a given matrix A?
Jan
on 16 May 2019
Edited: Jan
on 16 May 2019
A logical mask is much simpler than handling the indices:
A = randi([0,1], 8, 8); % Test data
p = 4;
mask = (cumsum(A, 1) > p) & A;
A(mask) = randi([0,1], nnz(mask), 1);
In each column, this leaves the first p ones untouched and replaces each further 1 by a 0 with a probability of 50%.
But if you want to change any 1s without keeping the first p 1s of each column:
mask = (sum(A, 1) > p) & A; % cumsum -> sum, auto-expand: >= R2016b
A(mask) = randi([0,1], nnz(mask), 1);
If you want to replace the 1s in the columns with more than p 1s with a different probability, e.g. so that each of them becomes 0 with a probability of 85%:
A(mask) = rand(nnz(mask), 1) > 0.85;
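The column-wise mask relies on implicit expansion, which the code comment ties to R2016b or later. On older releases the same mask can probably be built with bsxfun. A small sketch on the matrix from the question, with p = 3 assumed so that one column exceeds the threshold; maskNew and maskOld are illustrative names:

```matlab
A = [1 0 1 0 1
     1 1 1 0 0
     1 1 1 1 1
     1 1 0 0 1];                                % example matrix from the question
p = 3;                                          % assumed threshold for illustration

maskNew = (sum(A, 1) > p) & A;                  % implicit expansion (R2016b+)
maskOld = bsxfun(@and, sum(A, 1) > p, A ~= 0);  % explicit expansion via bsxfun
```

Both masks select the ones of the first column only, because it is the only column whose sum exceeds p here.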
3 Comments
Jos (10584)
on 16 May 2019
From the OP: "When the sum of each column exceeds the threshold=4 some random "1" of the column that goes beyond the threshold become zero."
If I now read it again, just a single one in such a column should be set to 0 then? ...
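If the question is read this way, a sketch of that interpretation could look as follows. It is only one possible reading, not something the OP confirmed: in every column whose sum exceeds p, exactly one randomly chosen 1 is set to 0. The names keys, r, cols and B are illustrative, and p = 3 is an assumption so that the example has a column over the threshold.

```matlab
A = [1 0 1 0 1
     1 1 1 0 0
     1 1 1 1 1
     1 1 0 0 1];                         % example matrix from the question
p = 3;                                   % assumed threshold for illustration

keys = rand(size(A));                    % one random key per element
keys(A == 0) = -Inf;                     % zeros can never win the max
[~, r] = max(keys, [], 1);               % row of one random 1 in each column
cols = find(sum(A, 1) > p);              % columns above the threshold
B = A;
B(sub2ind(size(A), r(cols), cols)) = 0;  % clear one 1 in each such column
```

Only the first column exceeds p in this example, so B differs from A in exactly one element.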