Change significance value in ttest2

Katara So on 2 Jan 2021
Edited: Ive J on 2 Jan 2021
I want to run two sets of ttest2 tests, one at a significance level of 5% and one at 1.25% (alpha = 0.05/4 = 0.0125), on the difference in mean value between the species Versicolor and Virginica (from the fisheriris data set) for each of the four variables.
This was my code for the default significance value:
load fisheriris
%Versicolor
ver1 = meas(51:100,1);
ver2 = meas(51:100,2);
ver3 = meas(51:100,3);
ver4 = meas(51:100,4);
%Virginica
vir1 = meas(101:150,1);
vir2 = meas(101:150,2);
vir3 = meas(101:150,3);
vir4 = meas(101:150,4);
% ttest2
[h1, p1, ci1] = ttest2(ver1, vir1, 'Vartype', 'unequal')
[h2, p2, ci2] = ttest2(ver2, vir2, 'Vartype', 'unequal')
[h3, p3, ci3] = ttest2(ver3, vir3, 'Vartype', 'unequal')
[h4, p4, ci4] = ttest2(ver4, vir4, 'Vartype', 'unequal')
And then for the significance level 0.0125 (1.25%) I wrote:
[h11, p11, ci11] = ttest2(ver1, vir1, 'Vartype', 'unequal', 'alpha', 0.05/4)
[h22, p22, ci22] = ttest2(ver2, vir2, 'Vartype', 'unequal','alpha', 0.05/4)
[h33, p33, ci33] = ttest2(ver3, vir3, 'Vartype', 'unequal','alpha', 0.05/4)
[h44, p44, ci44] = ttest2(ver4, vir4, 'Vartype', 'unequal','alpha', 0.05/4)
Firstly, does this give me the mean difference between the variables? Secondly, the two tests give the same h and p values and only the confidence interval differs, which I don't believe to be correct. I found a question asked here saying that ranksum should give the same p-value as ttest2, in which case the result could be considered correct. So I tried:
[rp1, rh1, rstats1] = ranksum(ver1, vir1, 'alpha', 0.05/4) % third output of ranksum is a stats struct, not a CI
which gives me a completely different value for p.
Could someone explain where I am going wrong? Thank you!

Answers (1)

Ive J on 2 Jan 2021
I found a question asked here saying that trying ranksum should give the same p value as ttest2 ...
No, it doesn't say that. It just says that the Wilcoxon test is a distribution-free (non-parametric) test. Non-parametric tests don't assume a particular underlying distribution. This does not mean that their test statistics, and hence their p-values, are the same as those of a t-test.
For the first part of your question, of course the p-value is the same irrespective of the alpha level you choose. The p-value is calculated from the test statistic, which remains constant simply because your variables of interest are the same! The only difference is the confidence interval, which depends on the alpha level you choose. When alpha is 0.05, you get a 95% CI, and when it is 0.0125, you get a (1 - 0.0125 =) 98.75% CI.
You already asked a question here on the CI but did not follow up on it.
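As a rough illustration of the point above, the sketch below runs both tests on the sepal length column from the question. It assumes the Statistics and Machine Learning Toolbox is available; the names pT, pW, hW and statsW are only illustrative. Note that ranksum returns its outputs in the order p, h, stats, so there is no confidence interval among them.

```matlab
load fisheriris                      % provides meas (150x4) and species
ver1 = meas(51:100, 1);              % versicolor, sepal length
vir1 = meas(101:150, 1);             % virginica, sepal length

% Welch two-sample t-test: compares means, assumes approximate normality
[~, pT] = ttest2(ver1, vir1, 'Vartype', 'unequal');

% Wilcoxon rank-sum test: rank-based, outputs are [p, h, stats]
[pW, hW, statsW] = ranksum(ver1, vir1);

% The two p-values come from different test statistics, so they need not agree
fprintf('ttest2 p = %.3g, ranksum p = %.3g\n', pT, pW);
```

Both tests may well lead to the same decision, but there is no reason to expect the numbers themselves to match.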
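A minimal sketch of this, assuming the same sepal length data as in the question (variable names such as pA and ciB are only illustrative): the p-value and test statistic are unchanged when alpha changes, while the confidence interval gets wider for the smaller alpha.

```matlab
load fisheriris
ver1 = meas(51:100, 1);              % versicolor, sepal length
vir1 = meas(101:150, 1);             % virginica, sepal length

% Same data, two alpha levels
[hA, pA, ciA, stA] = ttest2(ver1, vir1, 'Vartype', 'unequal');                  % alpha = 0.05
[hB, pB, ciB, stB] = ttest2(ver1, vir1, 'Vartype', 'unequal', 'Alpha', 0.05/4); % alpha = 0.0125

sameP   = abs(pA - pB) < 1e-12;                 % p-value does not depend on alpha
sameT   = abs(stA.tstat - stB.tstat) < 1e-12;   % neither does the test statistic
widerCI = diff(ciB) > diff(ciA);                % 98.75% CI is wider than the 95% CI

fprintf('same p: %d, same tstat: %d, wider CI at 0.0125: %d\n', sameP, sameT, widerCI);
```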
  2 Comments
Katara So on 2 Jan 2021
Sorry, I misunderstood it then.
Based on the question I asked earlier, I wrote this code:
% Difference between the two
diff1 = ver1 - vir1;
diff2 = ver2 - vir2;
diff3 = ver3 - vir3;
diff4 = ver4 - vir4;
n = length(diff1);
mean1 = mean(diff1);
mean2 = mean(diff2);
mean3 = mean(diff3);
mean4 = mean(diff4);
sdd1 = sqrt(var(diff1));
sdd2 = sqrt(var(diff2));
sdd3 = sqrt(var(diff3));
sdd4 = sqrt(var(diff4));
tstat1 = mean1/(sdd1/sqrt(n));
tstat2 = mean2/(sdd2/sqrt(n));
tstat3 = mean3/(sdd3/sqrt(n));
tstat4 = mean4/(sdd4/sqrt(n));
alpha = 0.05;
tcrit = tinv(1-alpha/2, n-1); % two-sided critical value
ci1 = [mean1 - tinv(1-alpha/2, n-1) * (sdd1/sqrt(n)), mean1 + tinv(1-alpha/2, n-1)*(sdd1/sqrt(n))]
ci2 = [mean2 - tinv(1-alpha/2, n-1) * (sdd2/sqrt(n)), mean2 + tinv(1-alpha/2, n-1)*(sdd2/sqrt(n))]
ci3 = [mean3 - tinv(1-alpha/2, n-1) * (sdd3/sqrt(n)), mean3 + tinv(1-alpha/2, n-1)*(sdd3/sqrt(n))]
ci4 = [mean4 - tinv(1-alpha/2, n-1) * (sdd4/sqrt(n)), mean4 + tinv(1-alpha/2, n-1)*(sdd4/sqrt(n))]
which I added between the species data and the ttest2 calls. However, I realized that ttest2 already gives me the CI, so I removed it. That is why I found the earlier question hard to follow and thought it better to ask a new one. Sorry if that was wrong of me.
So would it be correct to do as I have done in the code stated in the original question above? I specifically want to "(a) Investigate the hypothesis that the difference in mean value for the variables differ from the null hypothesis (alpha = 5%). This corresponds to a double sided t-test, use the function ttest2.
(b) change alpha to 0.0125% and find the confidence interval". What I am uncertain of is if the way I did it gives me the mean difference or if I have to calculate something before performing ttest2.
Sorry for repeating myself so much but I am new to Matlab and we aren't getting much help with the coding and I am really struggling to understand how to apply things I find on MathWorks to my problems.
Thank you once again for taking the time!
Ive J on 2 Jan 2021
Edited: Ive J on 2 Jan 2021
In this case you can use ttest2, provided your variables satisfy the t-test assumptions (e.g. normally distributed variables or a fairly large sample size). Please note that h tells you whether you can reject your H0 at the alpha level you've chosen. In your example, the p-values are below both 0.05 and 0.05/4, so you can confidently reject the null that the two populations have equal means (that's why h is the same for the two tests).
Hypothetically speaking, if the p-value were 0.03, you would reject the null only in the first test (alpha = 0.05) and not in the second (alpha = 0.05/4).
But again, note that this has nothing to do with the test statistic itself (tstat1 in your other snippet):
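The hypothetical case can be written out in two lines. This is only a sketch of the decision rule (reject when the p-value falls below alpha), using the made-up value 0.03 from the sentence above rather than any result from the iris data.

```matlab
p      = 0.03;               % hypothetical p-value, not computed from data
alphas = [0.05, 0.05/4];     % the two significance levels from the question

h = p < alphas;              % reject H0 where the p-value is below alpha
disp(h)                      % shows 1 for alpha = 0.05 and 0 for alpha = 0.0125
```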
tstat = mean(X) / (std(X)/sqrt(numel(X)));
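The formula above is the one-sample (or paired) form. For two independent samples with 'Vartype', 'unequal', the corresponding quantity is the Welch statistic. The sketch below computes it by hand, under the assumption that the two species are independent groups, and compares it with what ttest2 reports. It also shows that the CI from ttest2 is centered on mean(x) - mean(y), so nothing has to be subtracted before calling ttest2. The names meanDiff, se, tWelch, df, pManual and ciManual are illustrative.

```matlab
load fisheriris
x = meas(51:100, 1);                 % versicolor, sepal length
y = meas(101:150, 1);                % virginica, sepal length
nx = numel(x);  ny = numel(y);

meanDiff = mean(x) - mean(y);                      % estimated difference in means
se       = sqrt(var(x)/nx + var(y)/ny);            % standard error, unequal variances
tWelch   = meanDiff / se;                          % Welch t statistic
% Welch-Satterthwaite degrees of freedom
df       = se^4 / ((var(x)/nx)^2/(nx-1) + (var(y)/ny)^2/(ny-1));
pManual  = 2 * tcdf(-abs(tWelch), df);             % two-sided p-value

alpha    = 0.05;
ciManual = meanDiff + [-1 1] * tinv(1 - alpha/2, df) * se;

% Compare with the built-in test
[~, p, ci, stats] = ttest2(x, y, 'Vartype', 'unequal', 'Alpha', alpha);
fprintf('tstat diff: %.2g, p diff: %.2g\n', abs(tWelch - stats.tstat), abs(pManual - p));
fprintf('CI midpoint %.4f vs mean difference %.4f\n', mean(ci), meanDiff);
```

Changing alpha in this sketch only changes ciManual; tWelch, df and pManual stay the same.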
