What should my parameter be for the chi2pdf?

mabdorab on 17 Dec 2016
Commented: mabdorab on 18 Dec 2016
I created a histogram of my data and scaled it so that I can compare distributions against it. I have compared a few distributions and also want to compare the chi-squared distribution.
I used chi2pdf(100000:1:240000,?). What should go in place of the '?' here? Would I use the mean, since the degrees of freedom represent the expected value, if I am not mistaken?
My data are the numbers of births in 21 different years, i.e. from 1994 to 2014.
Also, my histogram is positively skewed, and any ideas on other distributions I could use would be appreciated. I have tried the Poisson, normal, chi-squared and exponential distributions, but they have not really represented my data well.
Thanks in advance.

Answers (2)

Star Strider on 17 Dec 2016
chi2pdf requires only two arguments: the values at which you want the density calculated, and the degrees of freedom. So you would put the degrees of freedom where the ‘?’ is. I invite you to read about the chi-squared distribution to learn what it describes and how best to use it.
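As a minimal sketch of that call (the degrees of freedom and the x range below are illustrative values, not taken from the question, and chi2pdf needs the Statistics and Machine Learning Toolbox):

```matlab
% Evaluate the chi-squared density on a grid of x values.
x  = 0:0.1:30;          % points at which to evaluate the density (illustrative)
nu = 5;                 % degrees of freedom (illustrative, not from the thread)
y  = chi2pdf(x, nu);    % density value at each x

% Sanity check: a density should integrate to roughly 1 over its support.
area = trapz(x, y);
```

Here `area` should come out very close to 1, which is a quick way to confirm the curve is on the same scale as a histogram normalised to a probability density.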
Most biomedical data are defined by a lognormal distribution. They are close to normally distributed, but only for positive values of the relevant variables for obvious reasons (unless you’re studying differences). The chi-squared distribution adheres to that criterion, but is chiefly used to compare distributions or patterns in data.
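A lognormal fit could be sketched as follows. The data here are synthetic, generated only so the example runs; the parameters 12 and 0.2 are assumptions chosen to land in roughly the range on the x-axis in the question, and lognrnd, lognfit and lognpdf need the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic positive, right-skewed data (NOT the asker's births data).
rng(0);                                  % fixed seed for reproducibility
data = lognrnd(12, 0.2, [1000 1]);       % assumed parameters mu = 12, sigma = 0.2

% Fit a lognormal: parmhat(1) estimates mu, parmhat(2) estimates sigma,
% i.e. the mean and standard deviation of log(data).
parmhat = lognfit(data);
muHat    = parmhat(1);
sigmaHat = parmhat(2);

% Evaluate the fitted density so it can be drawn over a pdf-normalised histogram.
x = 100000:1000:240000;
y = lognpdf(x, muHat, sigmaHat);
```

With real data, replace `data` with the observed values; both fitted parameters are estimated from the sample, so the spread is not tied to the mean as it is for the chi-squared distribution.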
If you are looking at numbers of births in different years, I would plot your data first. Do a Fourier transform (see the documentation for fft) to see how the births are distributed over time. This may be more informative than a distribution, since most births in humans occur in the late summer.
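A rough sketch of the fft idea, using a made-up monthly series (the 20-year length, the level of 1000 and the 12-month cycle are all assumptions for illustration):

```matlab
% Hypothetical monthly counts: a constant level plus a 12-month cycle.
nMonths = 240;                          % 20 years of monthly data (assumed)
t       = (0:nMonths-1)';
counts  = 1000 + 50*sin(2*pi*t/12);     % synthetic, not the asker's data

Y    = fft(counts - mean(counts));      % remove the mean so the DC term does not dominate
P    = abs(Y(1:nMonths/2+1)).^2;        % one-sided power spectrum
freq = (0:nMonths/2)'/nMonths;          % frequency in cycles per month

[~, iPeak] = max(P);
peakPeriod = 1/freq(iPeak);             % dominant period in months
```

For this synthetic series the dominant period is 12 months. Note that a seasonal cycle can only show up if the data are sampled more often than once a year; with one value per year, a spectrum can only reveal multi-year patterns.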
I would also consult PubMed to see how others have approached this problem.
  2 Comments
mabdorab on 17 Dec 2016
Hi Star Strider,
I am doing a comparison. I have been given births for 2 different UK regions, with data from 1994 to 2014. The data are split by mother's age, i.e. a 21 x 6 contingency table.
I plotted the histograms and I am trying to find which distribution represents the data well, as it has not been given in the question. I need to analyse my data using statistical techniques. Fourier analysis is not required, and I have used the chi-squared distribution in several different ways, all with different degrees of freedom depending on the number of estimated parameters. However, the data are a sample, hence the mean and variance are sample estimates. I hope that makes my analysis a little clearer.
Would my degrees of freedom be 20 in this case?
Star Strider on 18 Dec 2016
My guess is that since you’re comparing two different regions (that is, the number of classes k is 2), the degrees of freedom would be k-1, or 1 (or so I gather from just looking this up in Snedecor and Cochran, Statistical Methods, Eighth Edition).
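The right degrees of freedom depend on which hypothesis is being tested. If the question is whether year and mother's age group are independent within one 21 x 6 table, the usual chi-squared test of independence uses (r-1)(c-1) degrees of freedom. A minimal sketch under that assumption, with randomly generated counts standing in for the real table (chi2cdf needs the Statistics and Machine Learning Toolbox):

```matlab
% Hypothetical 21-by-6 table of counts (years x age groups); NOT the asker's data.
rng(1);
observed = randi([500 1500], 21, 6);

rowTot   = sum(observed, 2);            % 21-by-1 row totals
colTot   = sum(observed, 1);            % 1-by-6 column totals
N        = sum(observed(:));
expected = rowTot * colTot / N;         % expected counts under independence

% Pearson chi-squared statistic and its degrees of freedom.
stat = sum((observed(:) - expected(:)).^2 ./ expected(:));
dof  = (size(observed, 1) - 1) * (size(observed, 2) - 1);   % (21-1)*(6-1) = 100
p    = 1 - chi2cdf(stat, dof);          % upper-tail p-value
```

Comparing the two regions against each other would need a different table layout, and the degrees of freedom would change accordingly.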



John BG on 18 Dec 2016
Edited: John BG on 18 Dec 2016
The parameter called 'degrees of freedom' of a chi-square probability distribution is not the mean value but half the variance.
The variance is the second central moment, that is, the square of the standard deviation.
If you want, we can review these basics in further detail, or I can give you precise literature references.
John BG
  2 Comments
Star Strider on 18 Dec 2016
That is so wrong it is not worthy of correction.
mabdorab on 18 Dec 2016
I have never heard of the degrees of freedom being half of the variance. I am currently studying a statistics course and don't think I have come across this before. I have to agree with Star Strider here.
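For reference, a chi-squared distribution with k degrees of freedom has mean k and variance 2k, so the degrees of freedom equal the mean and also half the variance. A quick check with chi2stat (Statistics and Machine Learning Toolbox); the value 170000 below is a hypothetical mean picked from the middle of the x range in the question, not a figure from the data:

```matlab
% Theoretical mean and variance of a chi-squared distribution.
nu = 5;                          % illustrative degrees of freedom
[m, v] = chi2stat(nu);           % m is the mean, v is the variance

% If the degrees of freedom were set to a data mean of about 170000
% (hypothetical value), the variance would be forced to twice that.
nuBig = 170000;
[mBig, vBig] = chi2stat(nuBig);
sd = sqrt(vBig);                 % standard deviation implied by the model
```

Here `m` is 5 and `v` is 10, and the implied standard deviation for the large case is only about 583. Because one parameter fixes both the mean and the spread, a chi-squared curve with the degrees of freedom set to the sample mean is unlikely to match data spread over a range like 100000 to 240000.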

