Calculate confidence interval of data set

Question

Katara So on 17 Dec 2020

0
Link

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/696780-calculate-confidence-interval-of-data-set

Edited: Ive J on 20 Dec 2020

I want to calculate a 95% confidence interval for the difference between the mean of each variable for the species versicolor and virginica in the "fisheriris" data set, but I don't quite understand how to write the code. This is my code so far:

load fisheriris
%Versicolor
ver1 = meas(51:100,1);
ver2 = meas(51:100,2);
ver3 = meas(51:100,3);
ver4 = meas(51:100,4);
ver1_mean = mean(ver1);
ver2_mean = mean(ver2);
ver3_mean = mean(ver3);
ver4_mean = mean(ver4);
%Virginica
vir1 = meas(101:150,1);
vir2 = meas(101:150,2);
vir3 = meas(101:150,3);
vir4 = meas(101:150,4);
vir1_mean = mean(vir1);
vir2_mean = mean(vir2);
vir3_mean = mean(vir3);
vir4_mean = mean(vir4);
q1 = quantile(ver1_mean - vir1_mean, [0.95])
q2 = quantile(ver2_mean - vir2_mean, [0.95])
q3 = quantile(ver3_mean - vir3_mean, [0.95])
q4 = quantile(ver4_mean - vir4_mean, [0.95])

But I assume that I should have some sort of interval instead of just 0.95 in my quantile command, however whatever I put for ex, [0.05 0.95], I get the same value in for both ends of the interval and if I just have 0.95 then I don't get an interval. Could someone explain where I have gone wrong, all help is appreciated!

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

Ive J on 18 Dec 2020

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/696780-calculate-confidence-interval-of-data-set#answer_579000

Edited: Ive J on 18 Dec 2020

Open in MATLAB Online

I'm not sure if you want to calculate CI or something like reference range. Regardless, as you can see on quantile help page, it says Quantiles of a data set. So, of course when your inputs are scalar the output is same as the input.

data = randn(100, 1);
quantile(data, [0.05 0.95]) % this is not 95% CI
% 95% CI
CI95 = [mean(A) - 1.96*(std(A)./sqrt(numel(A))), mean(A) + 1.96*(std(A)./sqrt(numel(A)))]

2 Comments
Show NoneHide None

Katara So on 18 Dec 2020

Open in MATLAB Online

I want to calculate CI, however I am having trouble getting the mean values of the variables. In another problem where I had to use bootstrp to calculate CI, I did like this:

%Versicolor
ver1 = meas(51:100,1);
%Virginica
vir1 = meas(101:150,1);
M = 2000;
alpha = 0.05/4;
vers_boot1=bootstrp(M,@mean,ver1);
vir_boot1=bootstrp(M,@mean,vir1);
ci1 = quantile(vers_boot1-vir_boot1, [alpha 1-alpha])

which gave me an that ci1 = [-0.9170 -0.3960].

So now I want to do the same but without using bootstrp and for 95% CI. What I don't understand is how to get a vector of the mean values for the two different variables. With the code in my original question I get that the mean is only one value while in the problem posted above I get it to be a vector.

Ive J on 20 Dec 2020

Edited: Ive J on 20 Dec 2020

Open in MATLAB Online

In bootsrapt you simply do random sampling form ver1 and vir1; so, in above example you pick K random subsamples with replication from ver1 and vir1 and calculate the mean of each subsample. So, you have a vector of mean values, for which you can easily calculate CI.

I guess you are looking for something like this:

load fisheriris
%Versicolor
ver = meas(51:100,1);
%Virginica
vir = meas(101:150,1);
% 95% CI using bootstrapping
M = 2000;
alpha = 0.05/2;
ver_boot = bootstrp(M, @mean, ver);
vir_boot = bootstrp(M, @mean, vir);
ci_bootstrap = quantile(ver_boot - vir_boot, [alpha 1-alpha]);
% using Z-value (assuming normal distribution)
A = ver - vir;
ci_z = [mean(A) - 1.96*(std(A)./sqrt(numel(A))), ...
    mean(A) + 1.96*(std(A)./sqrt(numel(A)))];
% using t-distribution 
[~, ~, ci_t] = ttest2(ver, vir);
% compare them
ci_bootstrap
ci_z
ci_t'
ci_bootstrap =
   -0.8880   -0.4230
ci_z =
   -0.8942   -0.4098
ci_t =
   -0.8819   -0.4221
% visualize esitmated mean from bootstrapping
histogram(ver_boot - vir_boot)
line([mean(A), mean(A)], get(gca, 'YLim'), 'color', 'r', 'LineWidth', 1.5)

Sign in to comment.

Calculate confidence interval of data set

0 Comments
Show -2 older commentsHide -2 older comments

Answers (1)

2 Comments
Show NoneHide None

See Also

Categories

Tags

Community Treasure Hunt

Calculate confidence interval of data set

0 Comments Show -2 older commentsHide -2 older comments

Answers (1)

2 Comments Show NoneHide None

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

2 Comments
Show NoneHide None