Why am I getting the same p-value, h, and stats when using the Wilcoxon rank sum test on 7 different data sets?

I am trying to find out whether there is a significant relation between az_solar and Mean_Surface_REF for seven different bands using the Wilcoxon rank sum test. But I am getting the same p-value and statistics for every band, even though each band contains different values.
Would someone please help me with this issue?
%% This is the code I am using
[p1,h1,stats1] = ranksum(az_solar,Mean_Surface_REF(:,1));
[p2,h2,stats2] = ranksum(az_solar,Mean_Surface_REF(:,2));
[p3,h3,stats3] = ranksum(az_solar,Mean_Surface_REF(:,3));
[p4,h4,stats4] = ranksum(az_solar,Mean_Surface_REF(:,4));
[p5,h5,stats5] = ranksum(az_solar,Mean_Surface_REF(:,5));
[p6,h6,stats6] = ranksum(az_solar,Mean_Surface_REF(:,6));
[p7,h7,stats7] = ranksum(az_solar,Mean_Surface_REF(:,7));
Results:
8.1275311e-44
8.1275311e-44
8.1275311e-44
8.1275311e-44
8.1275311e-44
8.1275311e-44
8.1275311e-44
Also, I have attached my dataset.

Accepted Answer

dpb on 13 Jun 2022
>> taz=readmatrix('az_solar.xlsx').';
>> [mean(taz) std(taz)]
ans =
138.9109 16.7621
>>
>> mnref=readmatrix('Mean_Surface_REF.xlsx');
>> [mean(mnref); std(mnref)]
ans =
0.0030 0.0065 0.0166 0.0093 0.0013 0.0045 0.0049
0.0034 0.0023 0.0043 0.0034 0.0039 0.0053 0.0052
>>
That pretty much explains it: the mean of the one vector differs from the mean of any column in the array by some 4-5 orders of magnitude, while the standard deviations are far too small for the two samples to overlap. You've totally saturated the test statistic, to the point that the p-value returned is essentially zero.
You'd have been able to determine this yourself if you had simply looked at your data before blindly throwing it at some test statistic.
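A quick way to do that kind of look before testing is to check whether the two samples overlap at all. The sketch below uses synthetic stand-ins for the attached spreadsheets; the variable names, the sample size of 129 and the normal distributions are assumptions for illustration only, with just the rough scales taken from the means and standard deviations above.

```matlab
% Synthetic stand-ins for the attached files (assumed size and distributions).
rng(0);                               % reproducible illustration
n    = 129;                           % hypothetical number of observations
az   = 138.9 + 16.8*randn(n,1);       % roughly the scale of az_solar
refl = 0.005 + 0.004*randn(n,7);      % roughly the scale of Mean_Surface_REF

% If the smallest value of one sample exceeds the largest value of the
% other, the samples do not overlap and their joint ranking is fixed.
separated = min(az) > max(refl(:));

% One test per band in a loop instead of seven copy-pasted calls.
p = zeros(1,size(refl,2));
for k = 1:size(refl,2)
    p(k) = ranksum(az, refl(:,k));
end

disp(separated)   % true when no band overlaps az
disp(p)           % one p-value per band
```

When `separated` is true and all columns have the same length, every band gives the same ranks and therefore the same p-value, whatever the actual reflectance values are.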
>> [p,h,stats]=ranksum(taz,mnref(:,1))
p =
8.0633e-44
h =
logical
1
stats =
struct with fields:
zval: 13.8827
ranksum: 25026
>> format long, format compact
>> 1-normcdf(stats.zval)
ans =
0
>> normcdf(stats.zval)
ans =
1
>>
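The numbers in that output can be reproduced by hand, which also shows why `1-normcdf(stats.zval)` prints 0. This is only a sketch under assumptions: the sample size of 129 per group is inferred from the reported rank sum (25026 is exactly the sum of the ranks 130 to 258) rather than stated in the thread, no ties are assumed, and the continuity-corrected normal approximation is assumed to be what `ranksum` used for samples this large.

```matlab
n = 129;                          % assumed observations per group (inferred)
W = sum(n+1:2*n);                 % rank sum when every x value outranks every y value
mu    = n*(2*n+1)/2;              % mean of W under the null hypothesis
sigma = sqrt(n*n*(2*n+1)/12);     % standard deviation of W under the null (no ties)
z = (W - mu - 0.5)/sigma;         % z-score with continuity correction

p_naive = 2*(1 - normcdf(z));     % normcdf(z) rounds to 1, so this is exactly 0
p_tail  = 2*normcdf(-abs(z));     % evaluates the small lower tail directly

fprintf('W = %d, z = %.4f\n', W, z);
fprintf('naive p = %g, tail p = %g\n', p_naive, p_tail);
```

Working from the lower tail keeps the tiny probability representable in double precision, which is why the test still reports a nonzero p-value on the order of 1e-44 even though the subtraction from 1 underflows to 0.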

More Answers (1)

the cyclist on 13 Jun 2022
Edited: the cyclist on 13 Jun 2022
That test statistic only depends on the count and relative ordering of the respective set elements. In your case, it looks like the counts are always the same, and (as @dpb points out), your 7 cases are always completely offset from the comparator set.
Here is a small example showing the same thing:
x = 1:5;
y1 = 101:105; % 5 elements, fully offset from x
y2 = 1001:1005; % Also 5 elements, fully offset from x [same result]
y3 = 101:103; % 3 elements, still fully offset [different result due to different count]
ranksum(x,y1)
ans = 0.0079
ranksum(x,y2)
ans = 0.0079
ranksum(x,y3)
ans = 0.0357
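For small, fully separated samples like these, the p-value can also be written in closed form, which makes the dependence on the counts alone explicit. The sketch below assumes no ties and the exact method; the variable names `p_formula_*` and `p_test_*` are illustrative.

```matlab
x  = 1:5;
y1 = 101:105;   % 5 elements, fully offset from x
y3 = 101:103;   % 3 elements, fully offset from x

% Under the null hypothesis all nchoosek(n1+n2, n1) assignments of ranks
% to the first sample are equally likely. With complete separation the
% observed assignment is the single most extreme one, so the exact
% two-sided p-value is 2/nchoosek(n1+n2, n1).
p_formula_5 = 2/nchoosek(numel(x) + numel(y1), numel(x));
p_formula_3 = 2/nchoosek(numel(x) + numel(y3), numel(x));

% Request the exact method explicitly so the comparison is like for like.
p_test_5 = ranksum(x, y1, 'method', 'exact');
p_test_3 = ranksum(x, y3, 'method', 'exact');

fprintf('5 vs 5: formula %.4f, ranksum %.4f\n', p_formula_5, p_test_5);
fprintf('5 vs 3: formula %.4f, ranksum %.4f\n', p_formula_3, p_test_3);
```

The formula contains only the two sample sizes, so any fully offset `y` of the same length gives the same p-value, as `y1` and `y2` do above.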
