divide the matrix (Rx2) into submatrices based on the values of the second column
Alberto Acri
on 20 Sep 2023
Commented: Dyuman Joshi
on 22 Sep 2023
Hi! I tried to split the 'matrix_out' matrix into submatrices, binning the second column in steps of 0.1, and for the most part I succeeded.
load matrix_out
% =======
matrix_out_0 = matrix_out(matrix_out(:,2) < 0.1, :);
tot_percent_matrix_out_0 = sum(matrix_out_0(:,2));
matrix_separation_0 = [{matrix_out_0}, tot_percent_matrix_out_0];
% =======
matrix_separation = {};
j = 0.1:0.1:1.2;
for K = 1:width(j)
matrix_out_new = matrix_out((matrix_out(:,2) >= j(K) & matrix_out(:,2) < (0.1*K)+0.1), :);
tot_percent_matrix_out_new = sum(matrix_out_new(:,2));
matrix_separation = [matrix_separation; {matrix_out_new},tot_percent_matrix_out_new];
end
matrix_separation = [matrix_separation_0 ; matrix_separation]
In the result, however, I noticed that the row 423|1.2 appears in both the penultimate and the last cell of 'matrix_separation'.
The row 423|1.2 should appear only in the last cell, given the range >=1.2 & <1.3! Thanks to whoever can resolve this...

Accepted Answer
Dyuman Joshi
on 20 Sep 2023
Edited: Dyuman Joshi
on 20 Sep 2023
load matrix_out
%Mention the bins to group data in
j = [0 0.1:0.1:1.2 Inf];
%Discretize the data
idx = discretize(matrix_out(:,2),j);
%Split the array according to the groups
out1 = splitapply(@(x) {x}, matrix_out, idx)
You can see above that the 2nd last group is 13x2 instead of 14x2. The sum obtained will be modified accordingly as well.
%Get the sum of the 2nd column according to the groups
out2 = splitapply(@(x) sum(x), matrix_out(:,2), idx)
%Concatenate to get the final output
out = [out1 num2cell(out2)]
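As a small illustration of why discretize places each value in exactly one group, here is a sketch on made-up values (the vectors `edges` and `vals` are hypothetical, not taken from matrix_out). Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.

```matlab
% Hypothetical edges and values, chosen only to show the bin rules
edges = [0 0.1 1.2 Inf];
vals  = [0; 0.05; 0.1; 1.2; 5];

% Bin k covers edges(k) <= x < edges(k+1); the last bin also includes its right edge
idx = discretize(vals, edges);

% 0 and 0.05 fall in bin 1, 0.1 starts bin 2, 1.2 and 5 fall in bin 3
disp([vals idx])
```

Because both sides of every comparison come from the same `edges` vector, a value sitting on an edge can only satisfy one bin's condition.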
3 Comments
Dyuman Joshi
on 22 Sep 2023
What you are seeing is the limitation of floating point numbers.
load matrix_out
%Mention the bins to group data in
j = [0 0.1:0.1:1.2 Inf];
%% Let's see what the data is stored as
%First the matrix values
%Displayed value
disp(matrix_out(10:15,2))
%Stored value
fprintf('%0.42f\n',matrix_out(10:15,2))
%Now the values of the groups
%Displayed value
disp(j')
%Stored values
fprintf('%0.42f\n',j)
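You can see that the stored values are not exactly 0.1, 0.2, 0.3, etc. Only values that can be written as a finite sum of powers of 2 (for example 0.5 = 2^-1, 0.75 = 2^-1 + 2^-2, 1 = 2^0) are stored exactly; most decimal fractions are rounded to the nearest representable double.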
This means there will be some errors while working with floating point numbers.
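To make this concrete for the loop in the question, the sketch below recomputes the upper bound of the penultimate range the way the loop does for K = 11 and compares it with the stored value of 1.2. The variable names `upperEdge`, `x` and `isInPenultimateBin` are only illustrative.

```matlab
% Upper bound used by the question's loop when K = 11
upperEdge = (0.1*11) + 0.1;

% The value stored in the matrix row 423|1.2
x = 1.2;

% Print both with many digits to expose the difference
fprintf('%.20f\n', upperEdge);
fprintf('%.20f\n', x);

% The computed bound is slightly larger than 1.2, so the test "x < upperEdge" passes
isInPenultimateBin = x < upperEdge;
disp(isInPenultimateBin)
```

The comparison returns true, which is why the row with 1.2 is accepted by the penultimate range as well as by the last one.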
So, what to do now? One workaround is to scale the data up to integers and operate on those.
Since the values in the 2nd column of matrix_out have at most two digits after the decimal point, scale up by a factor of 10^2.
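%Scale the data up by a factor of 100
%Use round rather than floor, because a product such as 0.29*100 is stored
%slightly below 29 and floor would turn it into 28
vec = round(matrix_out(:,2)*100);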
%Scaling the bins
j = [0 10:10:120 Inf];
%Discretize the data according to the scaled values
idx = discretize(vec,j);
%Split the array according to the groups
out = splitapply(@(x) {x}, matrix_out, idx)
disp(out{3,1})
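The following is a self-contained sketch of the same scaled-integer idea on made-up data (the matrix `data` is hypothetical, since matrix_out is not reproduced here). It uses round for the scaling, because a product such as 0.29*100 evaluates to slightly less than 29. It also swaps splitapply for accumarray and arrayfun, on the assumption that some bins may be empty; as far as I know, splitapply expects every group number from 1 to the maximum to occur at least once.

```matlab
% Hypothetical Rx2 data: an ID in column 1, a value with two decimals in column 2
data = [1 0.05; 2 0.10; 3 0.29; 4 0.30; 5 1.19; 6 1.20];

% Integer bin edges, i.e. the original edges scaled by 100
edges = [0 10:10:120 Inf];
nBins = numel(edges) - 1;

% Scale to integers; round removes the tiny floating point error of the product
scaled = round(data(:,2) * 100);

% Assign every row to a bin
idx = discretize(scaled, edges);

% Collect the rows of each bin; bins without rows give an empty 0x2 matrix
groups = arrayfun(@(k) data(idx == k, :), (1:nBins)', 'UniformOutput', false);

% Sum of column 2 per bin; bins without rows get 0
sums = accumarray(idx, data(:,2), [nBins 1]);

% Same layout as the question: one cell with the rows, one with the total
out = [groups num2cell(sums)];
disp(out{13,1})
```

Here the row with 1.2 ends up only in the last bin and the row with 0.29 stays in the bin that starts at 0.2.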