MATLAB Answers


Multiple linear model p value f test t test

Asked by Tania on 21 Jul 2014
Latest activity: Commented on by Tania on 22 Jul 2014
Hi! I am a bit confused by the MATLAB documentation:

    Linear regression model:
        y ~ 1 + x1 + x2 + x3

                       pValue
        Intercept      4.8957e-21
        x1             9.8742e-08
        x2             0.08078
        x3             0.95236

    Number of observations: 93, Error degrees of freedom: 89
    Root Mean Squared Error: 4.09
    R-squared: 0.752, Adjusted R-Squared 0.744
    F-statistic vs. constant model: 90, p-value = 7.38e-27
There are two different kinds of p-values one can see: the individual ones for each coefficient, and one p-value for the model as a whole. What is the difference? Also, what is the difference between an F-test and an F-statistic? Why don't we calculate the p-value from a t-test? And what is the difference between the F-test and the t-test?
According to the documentation, the first kind of p-value is: "p-value for the F statistic of the hypothesis test that the corresponding coefficient is equal to zero or not. For example, the p-value of the F-statistic for x2 is greater than 0.05, so this term is not significant at the 5% significance level given the other terms in the model."
And the second p-value is: "p-value for the F-test on the model. For example, the model is significant with a p-value of 7.3816e-27."
Thanks so much!!!!


1 Answer

Answer by Shashank Prasanna on 21 Jul 2014
 Accepted Answer

These phrases have standard meanings in statistics, consistent with most of the literature you may find on linear regression. In short, the t-statistic is useful for making inferences about the individual regression coefficients; it is the one right next to your coefficients (x1, x2, ...) in the output. The F-statistic is the test statistic for testing the statistical significance of the model as a whole.
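For example, here is a minimal sketch with made-up data (the variable names and numbers are purely illustrative) showing where both kinds of p-values come from:

    % Minimal sketch with made-up data; names and values are illustrative only
    rng(0);                                % for reproducibility
    x1 = randn(93,1);  x2 = randn(93,1);  x3 = randn(93,1);
    y  = 3 + 2*x1 + 0.1*x2 + 4*randn(93,1);
    mdl = fitlm([x1 x2 x3], y);

    mdl.Coefficients        % per-coefficient Estimate, SE, tStat and pValue (t-tests)
    [p, F] = coefTest(mdl)  % F-test of the full model against the constant (intercept-only) model

The coefficient p-values come from t-tests on each coefficient; the single p-value at the bottom of the model display comes from the F-test of the whole model against the intercept-only model.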
Here is some explanation that might help; however, I'd urge you to go through a textbook or other material on this topic:

  4 Comments


Another question regarding the documentation (another mistake?):

                   Estimate     SE          tStat       pValue
    (Intercept)    62.405       70.071      0.8906      0.39913
    x1             1.5511       0.74477     2.0827      0.070822
    x2             0.51017      0.72379     0.70486     0.5009
    x3             0.10191      0.75471     0.13503     0.89592
    x4             -0.14406     0.70905     -0.20317    0.84407

One page says: "You can see that for each coefficient, tStat = Estimate/SE. The p-values for the hypothesis tests are in the pValue column. Each t-statistic tests for the significance of each term given the other terms in the model. According to these results, none of the coefficients seem significant at the 5% significance level, although the R-squared value for the model is really high at 0.97. This often indicates possible multicollinearity among the predictor variables."

(the link you posted)
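As a quick sanity check, the tStat = Estimate/SE relation can be reproduced directly from the numbers quoted above:

    % Quick check: tStat = Estimate./SE, using the values quoted above
    est = [62.405; 1.5511; 0.51017; 0.10191; -0.14406];
    se  = [70.071; 0.74477; 0.72379; 0.75471; 0.70905];
    est ./ se    % gives 0.8906, 2.0827, 0.70486, 0.13503, -0.20317 -- the tStat column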

On another page in the documentation it says:

http://www.mathworks.de/de/help/stats/linear-regression-model-workflow.html#bs9kfzr

    Estimated Coefficients:
                       Estimate     SE          tStat       pValue
        (Intercept)    118.28       7.6291      15.504      9.1557e-28
        sex_m          0.88162      2.9473      0.29913     0.76549
        age            0.08602      0.06731     1.278       0.20438
        wgt            -0.016685    0.055714    -0.29947    0.76524
        smoke_Yes      9.884        1.0406      9.498       1.9546e-15

    Number of observations: 100, Error degrees of freedom: 95
    Root Mean Squared Error: 4.81
    R-squared: 0.508, Adjusted R-Squared 0.487
    F-statistic vs. constant model: 24.5, p-value = 5.99e-14

The sex, age, and weight predictors have rather high p-values, indicating that some of these predictors might be unnecessary.

I thought that when the p-value of the t-stat is under 0.05, the coefficient has a significant effect on the model (as explained in the second link). The first example says nearly the opposite? All values are over 0.05...?

Thank you!

The p-value tests the null hypothesis that the coefficient is equal to zero, i.e. that it has no effect on the response. In the first example, p > 0.05 means you can't reject the null hypothesis that the coefficients are zero. But since the model is able to explain a lot of the variance (high R-squared), your variables may be collinear, which is precisely what is addressed in the next example. Go through the stepwise example next:
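For instance, a rough sketch with simulated, deliberately collinear predictors (the data and names here are made up purely to illustrate the effect):

    % Rough sketch: two nearly identical predictors to illustrate collinearity
    rng(1);
    x1 = randn(100,1);
    x2 = x1 + 0.01*randn(100,1);    % x2 is almost a copy of x1 -> collinear
    y  = 2*x1 + randn(100,1);
    X  = [x1 x2];

    corrcoef(X)        % large off-diagonal entries flag collinear predictors
    fitlm(X, y)        % R-squared is high, yet the individual t-tests look weak
    stepwiselm(X, y)   % stepwise selection keeps only the terms it finds useful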
Tania, I recommend some background reading on linear regression and statistics; otherwise your models and their interpretations may be dangerous to whoever will use them.
Also, if you have a new question, please close this question (accept an answer) and post your new question separately. That way you will have more eyes looking at it.
Okay cool, I think I misunderstood the sentence in the first example. So the null hypothesis is not rejected here; that makes sense. Thank you :)
