How to merge very close bins in histogram/histcounts

I have a histogram (saved as a *.mat file and attached), and I would like to "sharpen" it by merging the counts from very close bins (circled), so that the result consists of several isolated but taller single bins. One method I can think of is to set a threshold for "very close bins", calculate the bin-to-bin distances, group neighbouring bins, and merge the bins of each group. But is there a more flexible/dynamic way to do it without setting a fixed threshold?
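For reference, the fixed-threshold method described above could be sketched roughly as follows. The bin centers, counts, and the threshold `tol` are made-up illustration values, not the attached data, and placing each merged bin at the count-weighted mean position is just one possible choice.

```matlab
% Hypothetical data: centers and counts of the nonempty bins
centers = [1.0 1.1 1.2 5.0 5.1 9.0 9.1 9.2 9.3];
counts  = [4   7   2   5   3   1   6   2   8  ];

tol = 0.5;   % assumed fixed distance threshold for "very close" bins

% A new group starts wherever the gap to the previous bin exceeds tol
groupId = cumsum([1, diff(centers) > tol]);

% Sum the counts within each group
mergedCounts = accumarray(groupId(:), counts(:)).';

% Put each merged bin at the count-weighted mean position of its members
weightedSum   = accumarray(groupId(:), counts(:) .* centers(:)).';
mergedCenters = weightedSum ./ mergedCounts;
```

With these values the nine bins collapse into three groups, one per cluster of neighbouring centers.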

Accepted Answer

Jan on 15 Nov 2022
Edited: Jan on 15 Nov 2022
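You can use the File Exchange submission RunLength to find the long blocks of missing data (runs of zeros):
[b, n] = RunLength(x);        % b: run values, n: run lengths (x assumed to hold the bin counts)
first = cumsum(n) - n + 1;    % index in x at which each run starts
match = (b == 0 & n > 5);     % runs of zeros longer than 5 elements
Lim = first(match) + n(match) / 2;
Now the limits are the centers (as indices into x) of the blocks of zeros with a width of more than 5. Use them as bin edges for histcounts.
You may have to treat the margins separately if the data do not start and end with an empty block.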
If you do not have a C-compiler, use RunLength_M from the same submission.
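If you prefer not to depend on the File Exchange submission, the same idea can be sketched with base MATLAB only. This is a minimal sketch under assumptions: the data, the fine initial edges, and the minimum gap width `minGap` are hypothetical values chosen for illustration, and the run detection with `diff` stands in for RunLength.

```matlab
% Hypothetical data: three tight clusters of values
data  = [1.02 1.13 1.24, 5.05 5.16, 9.01 9.12 9.23 9.34];
edges = 0:0.1:10;                       % assumed fine initial binning
counts = histcounts(data, edges);

% Locate the runs of empty bins
isEmpty  = (counts == 0);
d        = diff([0, isEmpty, 0]);
runStart = find(d == 1);                % first bin index of each empty run
runEnd   = find(d == -1) - 1;           % last bin index of each empty run
runLen   = runEnd - runStart + 1;

minGap = 5;                             % assumed minimum gap width, in bins
keep   = runLen > minGap;

% An empty run over bins s..e spans edges(s)..edges(e+1); use its midpoint
newEdges = (edges(runStart(keep)) + edges(runEnd(keep) + 1)) / 2;

mergedCounts = histcounts(data, newEdges);
```

In this example the data start and end with an empty block, so every cluster is enclosed by two new edges. If that is not the case, the outermost edges have to be added by hand, which is the margin issue mentioned above.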

More Answers (1)

Steven Lord on 15 Nov 2022
Once you've binned your data, use whatever mechanism you want to determine which bin edges separate bins you want to merge. After you've removed those interior edges from the list of edges, set the histogram's BinEdges property to the pruned list and it will rebin the data.
x = randn(1, 1e5);
h1 = histogram(x, 12); % 12 bins
Let's delete every 4th bin edge. In order for both histograms to show up in this Answers post I need to create a new histogram, but if you wanted to update the existing one you'd use the commented out command.
E = h1.BinEdges;
E(4:4:end) = [];
figure
h2 = histogram(x, E); % or
% h1.BinEdges = E;
Let's check the bin counts and bin edges.
E1 = h1.BinEdges.';
result1 = table(h1.BinCounts.', [E1(1:end-1), E1(2:end)], 'VariableNames', ["Counts", "Bins"])
result1 = 12×2 table
    Counts         Bins
    ______    ______________
       26      -4.2    -3.48
      272     -3.48    -2.76
     1759     -2.76    -2.04
     7220     -2.04    -1.32
    18163     -1.32     -0.6
    27202      -0.6     0.12
    25265      0.12     0.84
    14171      0.84     1.56
     4774      1.56     2.28
     1008      2.28        3
      128         3     3.72
       12      3.72     4.44
E2 = h2.BinEdges.';
result2 = table(h2.BinCounts.', [E2(1:end-1), E2(2:end)], 'VariableNames', ["Counts", "Bins"])
result2 = 9×2 table
    Counts         Bins
    ______    ______________
       26      -4.2    -3.48
      272     -3.48    -2.76
     8979     -2.76    -1.32
    18163     -1.32     -0.6
    27202      -0.6     0.12
    39436      0.12     1.56
     4774      1.56     2.28
     1008      2.28        3
      140         3     4.44
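As one possible "mechanism" for choosing which edges to remove, the sketch below drops every interior edge that separates two adjacent nonempty bins, so each contiguous block of nonempty bins collapses into a single bin. This rule is an illustration only, not something prescribed above, and the data and initial edges are hypothetical.

```matlab
% Hypothetical data: three tight clusters of values
data  = [1.02 1.13 1.24, 5.05 5.16, 9.01 9.12 9.23 9.34];
edges = 0:0.1:10;                       % assumed fine initial binning
counts = histcounts(data, edges);

% Interior edge k+1 separates bin k from bin k+1;
% mark it for removal when both of those bins are nonempty
nonEmpty     = counts > 0;
dropInterior = nonEmpty(1:end-1) & nonEmpty(2:end);

% The first and last edge are never removed
prunedEdges = edges;
prunedEdges([false, dropInterior, false]) = [];

mergedCounts = histcounts(data, prunedEdges);
% For an existing histogram object h, h.BinEdges = prunedEdges; rebins it
```

This rule needs no distance threshold, but its result depends on the initial bin width: bins that are close yet separated by even a single empty bin are not merged.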
