How to use betafit for data that lies outside the range 0 to 1?

I have numbers between 10 and 20. If I try to use betafit, it refuses to fit because the data are not in the range 0 to 1. If I have numbers from, say, 0.12 to 0.25, it lets me fit a beta. I am not sure why the range should limit my ability to fit a beta distribution to a dataset. Help please.
Pappu Murthy on 12 Jan 2022
Edited: Pappu Murthy on 12 Jan 2022
I would like to elaborate a bit. I have one data set where the numbers vary from a minimum of 0.991 to a maximum of 1.441. If I try to "betafit" the Data as it is, I get an error saying my data cannot lie outside the 0-1 interval. So I used the command rescale(Data), which scaled the Data to between 0 and 1. Then I ran betafit and got the following picture, which is clearly wrong. Picture attached. What am I doing wrong here?


Accepted Answer

John D'Errico on 12 Jan 2022
Edited: John D'Errico on 12 Jan 2022
You essentially have a 4-parameter beta distribution that you wish to fit. That is, the lower and upper limits are parameters of the distribution too. But betafit only fits the 2-parameter version, because estimating those bounds can be difficult.
Why do I say the bounds are difficult to estimate automatically? Because where the distribution goes to zero at an end, exactly where does it become exactly zero? And on distributions where the PDF goes to infinity at an end, you now have a singularity, and trying to figure out exactly where the singularity lies is again a nasty thing.
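To see how much the result depends on where the limits are placed, here is a minimal sketch. It assumes the Statistics and Machine Learning Toolbox is available, uses simulated data, and pads the sample range by a few hypothetical fractions (the values in pads are arbitrary choices for illustration, not a recommendation).

```matlab
% Sketch: the fitted shape parameters depend on the assumed bounds [a, b].
rng(0);                                  % reproducible simulated data
x = randn(1,10000)/10 + 3;

span = max(x) - min(x);
pads = [0.01 0.1 0.5 1];                 % hypothetical padding fractions
phat = zeros(numel(pads),2);

for k = 1:numel(pads)
    a = min(x) - pads(k)*span;           % assumed lower bound
    b = max(x) + pads(k)*span;           % assumed upper bound
    u = (x - a)/(b - a);                 % data mapped strictly inside (0,1)
    phat(k,:) = betafit(u);              % 2-parameter fit on the unit interval
end

disp([pads(:) phat])                     % columns: padding, first shape, second shape
```

Each row is a perfectly valid beta fit, yet the shape parameters grow as the assumed interval widens, so the data alone do not pin down the bounds.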
But you can simply shift and scale your data to [0,1], or [-1,1], depending on the beta parameterization. For example...
% Simulated data centered near 3 with standard deviation 0.1
x = randn(1,10000)/10 + 3;
histogram(x,'Normalization','pdf')
Yes, we all know this is normally distributed data. Shhh! It's a secret. Now, suppose we want to fit a beta distribution to it. The beta distribution lives on the interval [0,1], so the data must first be shifted and scaled into that interval. Again, exactly where those limits really should be is very difficult to quantify.
% Center the data, then scale by the largest absolute deviation so it spans [0,1]
xscale = (x - mean(x));
xscale = xscale/max([abs(min(xscale)),max(xscale)])/2 + 1/2;
histogram(xscale,'Normalization','pdf')
PHAT = betafit(xscale)
PHAT = 1×2
5.6668 5.6614
hold on
fplot(@(X) betapdf(X,PHAT(1),PHAT(2)),[0 1])
That is not a terrible fit, but a normal would surely have done better.
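If you want the fitted curve on the original axis rather than on [0,1], the change of variables can be sketched as follows. This is only an illustration under stated assumptions: the bounds a and b are picked by padding the sample range by 1% (a hypothetical choice so that no point lands exactly on 0 or 1), and pdfX is a helper name introduced here.

```matlab
% Sketch: fit on assumed bounds [a, b], then map the density back to the x scale.
rng(0);                                   % reproducible simulated data
x = randn(1,10000)/10 + 3;

pad = 0.01*(max(x) - min(x));             % hypothetical 1% padding
a = min(x) - pad;                         % assumed lower bound
b = max(x) + pad;                         % assumed upper bound

u = (x - a)/(b - a);                      % data mapped into (0,1)
phat = betafit(u);                        % 2-parameter fit on the unit interval

% If U = (X - a)/(b - a), then f_X(t) = f_U((t - a)/(b - a)) / (b - a).
pdfX = @(t) betapdf((t - a)/(b - a), phat(1), phat(2))/(b - a);

histogram(x,'Normalization','pdf')
hold on
fplot(pdfX,[a b])
hold off
```

The division by (b - a) is what keeps the curve a proper density on the original scale. Without it the plotted curve would not integrate to 1 over [a, b].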
Finally, what did you do wrong in the picture you showed? That data does not look very much like any beta distribution I can think of, with that large spike in the center but flat tails. I'm not remotely surprised you got crap for a fit. And it does not look like you have much data. betafit had problems fitting a beta to that data. Again, absolutely no surprise.
Pappu Murthy on 17 Jan 2022
Thanks for the help. I didn't realize that it can get tricky but your answer really clarified everything.

Release

R2021b