Edges and Observed values of Chi2gof

I am trying to run a chi-squared test on the means of this data. The data are already sorted, so that is not the issue here. However, the observed values in the output are just four ones, even though I passed in the means I calculated. Does anyone know why this happens and how to fix it? Your help would be appreciated.
a=readmatrix("DIP Project- Population.xlsx");
population=a(:,1);
production=a(:,7);
gr1gen36=a(61:90,7);
gr1gen36(isnan(gr1gen36))=0;
gr1gen36;
m1=mean(gr1gen36);
gr2gen36=a(151:180,7);
gr2gen36(isnan(gr2gen36))=0;
gr2gen36;
m2=mean(gr2gen36);
gr3gen36=a(241:270,7);
gr3gen36(isnan(gr3gen36))=0;
m3=mean(gr3gen36);
gr4gen36=a(331:360,7);
gr4gen36(isnan(gr4gen36))=0;
gr4gen36;
m4=mean(gr4gen36);
gen36=[m1,m2,m3,m4];
expected=mean(gen36);
ex=[expected,expected,expected,expected];
[h,p,tbl]=chi2gof(gen36,'Expected',ex,'Alpha',0.05)
tbl =
struct with fields:
chi2stat: 66.0168
df: 3
edges: [9.3667 13.5667 17.7667 21.9667 26.1667]
O: [1 1 1 1]
E: [18.4500 18.4500 18.4500 18.4500]

6 Comments

Extra note: The mean values I am using are all currently in the edges field. The value of 13.5667 is not one of the means I was using.
Can you upload the data? You can use the paper clip icon in the INSERT section of the toolbar.
Nathan on 26 Dec 2025
Edited: Nathan on 26 Dec 2025
I know the data can't be the issue: I already checked it and it is all just numbers, and if the data were wrong, chi2gof wouldn't work to begin with. I then took the means of the columns of data, and that is what I passed to chi2gof.
I'm not saying the data is the problem. I'm trying to save myself (and everyone else here) the trouble of creating a test dataset to explore your problem. Posting a self-contained piece of data and code that exhibits your problem is always the best way to get help here. Make it as easy as possible for someone to help you.
Edit: Are you saying that the input to chi2gof() is
gen36 = [9.3667 17.7667 21.9667 26.1667]
?
That was not clear to me from your explanation, but now maybe I think that's what you meant. If so, then I understand why the data of how you got to gen36 are not important.
Edit #2: I can't figure out an input to chi2gof() that replicates your result.
Nathan on 26 Dec 2025
Edited: Nathan on 26 Dec 2025
The input to chi2gof is [21.7000 26.1667 16.5667 9.3667].
Edit: If chi2gof() won't work I can always manually calculate the chi-squared value like so:
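OE=(gen36-expected).^2/expected % contribution of each group, (O-E)^2/E
sum(OE) % chi-squared statistic
chi2inv(0.95,3) % critical value for alpha=0.05, df=3 (chi2inv takes the cumulative probability, i.e. 1-alpha)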
Your help is greatly appreciated :)
I have solved my issue by doing the chi-squared goodness of fit test manually to get a more reasonable value. Your help has been greatly appreciated and I thank you very much :)


 Accepted Answer

the cyclist on 26 Dec 2025
Edited: the cyclist on 26 Dec 2025
Here's what is happening. chi2gof() is expecting the raw, observed data. It is not expecting you to have precalculated the means. Therefore, what is it doing with your inputs?
It thinks that all of your observed data points are
x = [21.7000 26.1667 16.5667 9.3667]; % Total of four data points, to be binned
You then effectively tell it that you have four bins, because that is the length of the vector ex.
So, what does chi2gof do with this information? It puts one value of x into each of the four bins. This is why tbl.O = [1 1 1 1]. And then it (correctly) calculates the chi^2 stat, based on one observation in each bin, when it was told to expect 18.45 observations per bin.
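The binning described above can be imitated with base MATLAB. This is only a sketch, not the internals of chi2gof: it assumes equal-width bins running from the smallest to the largest input value, which is what the reported tbl.edges suggests, and it takes the expected count of 18.45 per bin from tbl.E.

```matlab
% Sketch only: mimics the binning with base MATLAB instead of calling chi2gof.
x = [21.7000 26.1667 16.5667 9.3667];   % the four means, treated as raw data
E = 18.45 * ones(1, 4);                 % expected counts per bin, as in tbl.E

% Assumption: equal-width bins from min(x) to max(x), one bin per entry of E.
edges = linspace(min(x), max(x), numel(E) + 1);
O = histcounts(x, edges);               % observed count in each bin

chi2stat = sum((O - E).^2 ./ E);        % Pearson chi-squared statistic
df = numel(E) - 1;                      % degrees of freedom
```

Under that assumption, O comes out as [1 1 1 1] and chi2stat is about 66.0168, matching the reported table. The interior edges (13.5667, 17.7667, 21.9667) are evenly spaced bin boundaries rather than input values, which is why 13.5667 shows up in edges without being one of the means.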
You should have fed chi2gof() the raw counts.

12 Comments

I have inputted the raw values but it says the following error:
Error using chi2gof (line 129)
X must be a vector of real values.
Error in CSgen36 (line 22)
[h,p,tbl]=chi2gof(gen36,'Expected',ex,'Alpha',0.05)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is the raw count:
gen36 =
31 27 34 3
0 42 18 12
28 39 23 10
30 8 35 26
26 33 32 12
7 45 14 18
43 37 8 0
30 28 31 0
17 28 2 15
6 41 12 0
25 23 19 42
27 28 23 0
34 31 22 0
6 29 16 0
3 24 38 27
29 35 7 5
23 13 11 0
0 20 0 23
12 25 32 25
1 30 29 3
30 32 0 0
25 31 0 30
6 32 0 0
31 11 2 0
27 8 53 0
What else can I do?
What do the rows and columns represent? chi2gof is expecting one vector of values, and is applying a statistical test to determine whether x comes from a specified distribution. You seem to want something else, and I'm not confident you are testing what you think you are.
I am treating each column as a separate group and testing whether the set of groups differs significantly from the average. Is that not what the chi-squared test is for, or should I be using another statistical test?
It sounds to me like you want to do an "analysis of variance" (ANOVA) test, which you can do with anova1.
But I would strongly recommend you read the documentation to determine if that seems right to you, and not just take my word for it.
gen36 = [ ...
31 27 34 3
0 42 18 12
28 39 23 10
30 8 35 26
26 33 32 12
7 45 14 18
43 37 8 0
30 28 31 0
17 28 2 15
6 41 12 0
25 23 19 42
27 28 23 0
34 31 22 0
6 29 16 0
3 24 38 27
29 35 7 5
23 13 11 0
0 20 0 23
12 25 32 25
1 30 29 3
30 32 0 0
25 31 0 30
6 32 0 0
31 11 2 0
27 8 53 0];
[p,tbl,stats]=anova1(gen36)
p = 3.9157e-05
tbl = 4×6 cell array
    {'Source' }    {'SS'        }    {'df'}    {'MS'        }    {'F'       }    {'Prob>F'    }
    {'Columns'}    {[4.0584e+03]}    {[ 3]}    {[1.3528e+03]}    {[  8.6408]}    {[3.9157e-05]}
    {'Error'  }    {[1.5030e+04]}    {[96]}    {[  156.5600]}    {0×0 double}    {0×0 double  }
    {'Total'  }    {[1.9088e+04]}    {[99]}    {0×0 double  }    {0×0 double}    {0×0 double  }
stats = struct with fields:
    gnames: [4×1 char]
    n: [25 25 25 25]
    source: 'anova1'
    means: [19.8800 28 18.4400 10.0400]
    df: 96
    s: 12.5124
If this seems right, then you can also run
multcompare(stats)
afterward, to compare groups.
I have already done an ANOVA on this data but your help has been appreciated :)
One more small question: Would doing the chi-squared test manually work as well? (As in using code to calculate the value instead of the chi-squared command.)
It is unclear to me how you are trying to interpret the chi^2 calculation. What do you think it means? Why are you trying to go beyond the ANOVA result, which seems to answer exactly what you are asking?
Well, the chi^2 calculation should back up my ANOVA result, no? It would provide further evidence that the data is different from the average, showing significant change. Is that a wrong thing to do?
I understand better now, and I suddenly realized something. You have been using a chi-square goodness of fit function, but what you actually want to do is a chi-square test of independence. Those are two different things!
I would not say that doing the second test provides "further evidence". At best, the two tests should simply be consistent with each other.
I'm not sure if what you have calculated manually does that test, but maybe. I'd probably need to understand your data better. I don't think you ever stated what the rows represent. Is each row an independent test subject? Are the values counts? Etc.
Nathan on 26 Dec 2025
Edited: Nathan on 27 Dec 2025
Values are counts and each column is a different group. Each value is an individual.
Edit: How would I do a chi-squared test of independence in MATLAB, since there is no command to do so?
Edit 2: The ANOVA and chi2gof test do match up :)
Well, a complication is that the test is for categorical or nominal variables, and yours is continuous. So in some ways it is just not the right test. You could bin the data to put them into a contingency table, and then it looks like you can use crosstab to report on the test. Maybe an AI can help you write the code. Depending on how critical the result is, you might want to consult someone with greater expertise, if you can. I don't want you to rely on my quick thoughts on a Friday evening.
That being said, a bigger issue is that the appropriate research question and statistical tests should really be determined before the results are seen. I think ANOVA (or perhaps kruskalwallis), by itself, is likely the best test, full stop.
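The binning idea and the Kruskal-Wallis alternative from the comment above could be sketched as follows. This is an illustration under stated assumptions, not a recommended analysis: the bin edges are hypothetical and chosen only to cover the posted values, the variable names are made up for this example, and crosstab and kruskalwallis require the Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox.
gen36 = [
    31 27 34  3
     0 42 18 12
    28 39 23 10
    30  8 35 26
    26 33 32 12
     7 45 14 18
    43 37  8  0
    30 28 31  0
    17 28  2 15
     6 41 12  0
    25 23 19 42
    27 28 23  0
    34 31 22  0
     6 29 16  0
     3 24 38 27
    29 35  7  5
    23 13 11  0
     0 20  0 23
    12 25 32 25
     1 30 29  3
    30 32  0  0
    25 31  0 30
     6 32  0  0
    31 11  2  0
    27  8 53  0];

% Stack the columns and label each value with its group (column) number.
values = gen36(:);
group  = repelem((1:size(gen36, 2))', size(gen36, 1));

% Hypothetical bin edges: an assumption; a different choice changes the result.
binEdges = [0 10 20 30 60];
binIdx = discretize(values, binEdges);

% Contingency table (bins x groups) with a chi-squared test of independence.
[ctab, chi2, pChi2] = crosstab(binIdx, group);

% Nonparametric alternative mentioned above; 'off' suppresses the figures.
pKW = kruskalwallis(gen36, [], 'off');
```

Because the outcome of the binned test depends on where the edges are placed, it is better read as a consistency check than as independent evidence alongside the ANOVA.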
@Nathan, I think @the cyclist is providing you with excellent advice: do the ANOVA or Kruskal-Wallis, then stop. And 100% for "the appropriate research question and statistical tests should really be determined before the results are seen".


Release: R2025a
Asked: 26 Dec 2025
Commented: 28 Dec 2025
