Different confidence intervals for regression slope

Question

Brian Scannell on 28 Apr 2017

https://au.mathworks.com/matlabcentral/answers/337854-different-confidence-intervals-for-regression-slope

Commented: Star Strider on 28 Apr 2017

Can anyone explain why I get different confidence limits for the slope of a linear regression when I use polyfit and polyparci compared with fitlm and coefCI? For example, the following code generates linearly correlated data with added noise, then performs the least-squares fit three ways (directly, with polyfit, and with fitlm), extracting the key results at each step:

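clear variables
x = (0:10)';
Y = 3.5*x + (((rand(size(x))-0.5)/3).*x);   % slope 3.5 plus uniform noise that scales with x
% option 1: direct least squares with the backslash operator
X = [ones(size(Y)), x];                     % design matrix: [intercept column, x]
B1 = X\Y;                                   % coefficients as [intercept; slope]
Ycalc = X*B1;
R21 = 1 - sum((Y - Ycalc).^2)/sum((Y - mean(Y)).^2);         % R-squared
R2a1 = 1 - ((1-R21)*(length(Y)-1)/(length(Y)-length(B1)));   % adjusted R-squared
clear X Ycalc
% option 2: polyfit, with polyparci for the confidence limits
[p,S] = polyfit(x,Y,1);
B2 = fliplr(p)';                            % polyfit returns [slope, intercept]; reorder to match B1
coef = corrcoef(x,Y);
R22 = coef(1,2)^2;                          % R-squared from the correlation coefficient
R2a2 = 1 - ((1-R22)*(length(Y)-1)/(length(Y)-length(B2)));
ci2 = polyparci(p,S,0.95);
clear p S coef
% option 3: fitlm, with coefCI for the confidence limits
mdl = fitlm(x,Y,'y ~ x1');
B3 = mdl.Coefficients{:,1};
R23 = mdl.Rsquared.Ordinary;
R2a3 = mdl.Rsquared.Adjusted;
ci3 = coefCI(mdl,0.05);                     % alpha = 0.05 requests 95% intervals
ci3 = fliplr(ci3');                         % transpose and flip for comparison with ci2
clear mdl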

As one would expect, all of the approaches produce the same regression coefficients, R-squared and adjusted R-squared values. However, the confidence intervals generated by polyparci and coefCI are different. In all cases I have tried, the range of the confidence limits returned by coefCI is wider than that from polyparci.

Can anyone explain why the methods produce different results?

Thanks, Brian

Answer 1

Brian Scannell on 28 Apr 2017

https://au.mathworks.com/matlabcentral/answers/337854-different-confidence-intervals-for-regression-slope#answer_264953

Ah, I think I've resolved it. The two functions appear to interpret the confidence-level argument differently: polyparci(p,S,0.95) and coefCI(mdl,0.1) give the same answers.

I'm still not sure which set of limits is most appropriately described as the "95% confidence intervals", though. Any views?
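One way to see which convention corresponds to which interval is to build the limits by hand from the coefficient standard errors and compare them with coefCI. The following is only a sketch under the usual ordinary-least-squares assumptions; the variable names (se, t95one, t95two, ci90, ci95, d90, d95) are illustrative and do not come from polyparci or fitlm. It needs tinv, fitlm and coefCI from the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                    % fixed seed so the run is repeatable
x = (0:10)';
Y = 3.5*x + (((rand(size(x))-0.5)/3).*x);

X  = [ones(size(x)), x];                   % design matrix: [intercept column, x]
B  = X\Y;                                  % least-squares coefficients
df = numel(Y) - numel(B);                  % residual degrees of freedom
s2 = sum((Y - X*B).^2)/df;                 % residual variance estimate
se = sqrt(diag(s2*inv(X'*X)));             % coefficient standard errors

t95one = tinv(0.95,  df);                  % quantile at 0.95: leaves 5% in each tail
t95two = tinv(0.975, df);                  % quantile at 0.975: leaves 2.5% in each tail

ci90 = [B - t95one*se, B + t95one*se];     % two-sided 90% interval
ci95 = [B - t95two*se, B + t95two*se];     % two-sided 95% interval

mdl = fitlm(x, Y);
d90 = ci90 - coefCI(mdl, 0.10);            % differences should be at rounding level
d95 = ci95 - coefCI(mdl, 0.05);
disp(max(abs(d90(:))))
disp(max(abs(d95(:))))
```

If both differences come out at rounding level, the limits built from the 0.95 quantile are the ones coefCI labels as alpha = 0.1, and the limits built from the 0.975 quantile are the ones it labels as alpha = 0.05.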

Answer 2

Star Strider on 28 Apr 2017

https://au.mathworks.com/matlabcentral/answers/337854-different-confidence-intervals-for-regression-slope#answer_264971

I originally tested polyparci only with nlparci, and the estimates then were essentially the same. I posted it before fitlm appeared.

Change the ‘tstat’ assignment in polyparci to:

tstat = @(tval) (max(alpha,(1-alpha)) - t_cdf(tval,PolyS.df) );    % Function to calculate t-statistic for p = ‘alpha’ and v = ‘PolyS.df’

and the results are identical with nlparci, fitlm and regress.

Thank you for discovering this glitch with the ‘alpha’ argument. I’ll update polyparci and post it.
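For illustration only, the idea of accepting the level as either 0.95 or 0.05 can be sketched with a small helper. critT is a hypothetical name and this is not the polyparci source; it assumes a two-sided interval and uses tinv from the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical helper (not taken from polyparci): two-sided critical t-value
% for a level supplied either as a confidence level (0.95) or as alpha (0.05).
critT = @(level, df) tinv(1 - min(level, 1 - level)/2, df);

df  = 9;                           % e.g. 11 data points minus 2 fitted coefficients
t_a = critT(0.95, df);             % level given as a confidence level
t_b = critT(0.05, df);             % level given as a significance level
fprintf('%.4f  %.4f\n', t_a, t_b); % both equal tinv(0.975, df), approx. 2.2622
```

Because min(level, 1 - level) is 0.05 in both calls, the two conventions produce the same critical value, and so the same interval.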

2 Comments

Brian Scannell on 28 Apr 2017

I am less confused by the alpha versus 1 - alpha issue than by the fact that to get matching results I have to specify 0.95 in polyparci and 0.1 (effectively 0.9) in coefCI.

I am interpreting the results from polyparci as "there is a 95% probability that the 'true' gradient is less than the calculated upper limit" and, similarly, "there is a 95% probability that the 'true' gradient is more than the lower limit". Taken together, that means there is a 10% chance that the 'true' gradient is outside the bounds defined by the upper and lower limits.

So if the alpha input to coefCI is for the probability of the "true" gradient being outside the returned limits, then the factor two difference in the alpha value for the two functions makes sense.

But is this a correct interpretation of the outputs from the two functions?

Is this a distinction between "confidence limits" and "confidence interval"?

Thanks for your help.

Star Strider on 28 Apr 2017

My pleasure.

With the correction I posted, there is no ambiguity, and the confidence interval will be the same.

My impression is that the confidence interval calculation in nlparci changed between the time I wrote the function and now. I changed my function to accord with the current behavior of the MATLAB Statistics and Machine Learning Toolbox functions.

‘Taken together, it means there is a 10% chance that the "true" gradient is outside the bounds defined by the upper and lower limits.’

That is incorrect, at least as I read it. For a 95% confidence interval (alpha = 0.05), there is a 95% probability that the true value is within the limits and a 5% probability (2.5% in each tail) that it lies outside them.

The terms ‘confidence limits’ and ‘confidence interval’ are essentially the same. The context must be clear if either term is used. I prefer the term ‘confidence limits’.
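The tail probabilities mentioned above can be checked numerically. This is a minimal sketch assuming 9 residual degrees of freedom (11 points, 2 coefficients) and the tinv and tcdf functions from the Statistics and Machine Learning Toolbox; the variable names are illustrative.

```matlab
df = 9;                                          % assumed residual degrees of freedom

% Critical value at the 0.975 quantile: 2.5% in each tail
tcrit     = tinv(0.975, df);                     % approx. 2.262
inside    = tcdf(tcrit, df) - tcdf(-tcrit, df);  % probability between the limits: 0.95
lowerTail = tcdf(-tcrit, df);                    % 0.025
upperTail = 1 - tcdf(tcrit, df);                 % 0.025

% Critical value at the 0.95 quantile: 5% in each tail
tOne      = tinv(0.95, df);                      % approx. 1.833
insideOne = tcdf(tOne, df) - tcdf(-tOne, df);    % probability between the limits: 0.90

fprintf('%.3f %.3f %.3f %.3f\n', inside, lowerTail, upperTail, insideOne);
```

So limits built from the 0.975 quantile enclose 95% of the t-distribution, while limits built from the 0.95 quantile enclose 90%, which is the factor of two in alpha discussed in this thread.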
