
Hello:

The image below is a histogram of a large data set (a 90-by-1 double, in blue) and a single data point (in red). I would like to compute the probability of the red data point relative to the blue data. I could add up the counts to the left of the red bar and divide by the total count (90), but I want MATLAB code that does this faster and more efficiently, preferably without using the histogram at all. Thank you.

Steven Lord
on 8 May 2018

Change the Normalization property of the histogram object then get the appropriate element of the Values property of that object.

rng default

x = randn(10000,1);

h = histogram(x)

h.Values(10)

Since the default Normalization method is 'count', this will tell you that there are 133 elements of x that fall into bin 10. [Since I used rng default, you should get the exact same random numbers in x as I did and so generate the exact same histogram.]

h.Normalization = 'probability';

h.Values(10)

Now h.Values(10) is 0.0133 which makes sense: 133 / 10000 (the total number of points) = 0.0133.

If you wanted to get the same information without actually bringing up the plot, the histcounts function also lets you specify a 'Normalization' method.
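As a minimal sketch of that histcounts route, reusing the same seed and data as above: the variable names p and edges are my own, and I am assuming histcounts picks the same automatic bins as histogram for this data, so p(10) should match h.Values(10) from the 'probability' case.

```matlab
rng default                      % same seed as above, so x is the same data
x = randn(10000,1);

% Bin probabilities without opening a figure.
% p(k) is the fraction of x that falls in bin k; edges holds the bin edges.
[p, edges] = histcounts(x, 'Normalization', 'probability');

p(10)                            % fraction of x in bin 10
sum(p)                           % all bins together account for every point
```

The second output, edges, is useful if you need to find which bin a particular value falls into.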

And I'd guess that the histogram you showed was created with something more like 900 data points than 90. According to the Y limits, each of the 5 central bars contains more than 90 elements, assuming you're using the default 'count' Normalization. Still not Big Data, but bigger.

Image Analyst
on 7 May 2018

You need to know the edges of the bin, e1 and e2. Then you can simply do

fractionInBin = sum(data >= e1 & data < e2) / numel(data); % a fraction between 0 and 1; multiply by 100 for a percentage

No histogram needed if you just need it for that one red bin.
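Here is a small self-contained sketch of that idea. The data, the edges e1 and e2, and redValue are placeholders I made up for illustration, not values from the original plot. It also shows the "everything to the left of the red value" fraction mentioned in the question, which may be closer to what was asked.

```matlab
rng default
data = randn(90,1);      % placeholder for the 90-by-1 data set
e1 = 0.5;                % hypothetical left edge of the red bin
e2 = 1.0;                % hypothetical right edge of the red bin
redValue = 0.75;         % hypothetical value of the red data point

% Fraction of the data inside the bin [e1, e2).
% The mean of a logical array is the fraction of true elements.
fractionInBin = mean(data >= e1 & data < e2);

% Fraction of the data strictly to the left of the red value.
fractionBelow = mean(data < redValue);
```

Both results lie between 0 and 1, and neither needs a histogram or a loop.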

By the way, it made me snicker when you described 90 elements as large. It literally would have to be around a million times that big before anyone might start considering it large.

Image Analyst
on 8 May 2018

Just the bar in red.

To do it without explicitly computing a histogram array, you'd have to do it one bin at a time. Much better to simply get the histogram and divide the counts array by the total counts. Why can't you compute the histogram?
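A rough sketch of the "get the histogram and divide by the total" approach, using histcounts so no figure is drawn. The data and redValue are placeholders I invented, and locating the red bin with discretize is my own choice rather than anything from the original post.

```matlab
rng default
data = randn(90,1);                  % placeholder for the 90-by-1 data set
redValue = 0.75;                     % hypothetical value of the red data point

[counts, edges] = histcounts(data);  % counts per bin, automatic bin edges
binProb = counts / sum(counts);      % probability of each bin
cumProb = cumsum(binProb);           % running total from the leftmost bin

% Index of the bin that contains redValue (NaN if it lies outside the edges).
k = discretize(redValue, edges);

if ~isnan(k)
    probRedBin  = binProb(k);             % probability of the red bin itself
    probLeftOfK = cumProb(k) - binProb(k); % total probability of bins to its left
end
```

This computes every bin at once, so the same binProb and cumProb arrays can be reused for any other value of interest.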
