Regression with dependent independent variables

When undertaking a linear regression evaluating y as a function of x and x^3, is there a specific function within MATLAB that takes account of the mutual dependence of the independent variables (x and x^3)?
I've tried searching the documentation but haven't found anything that specifically addresses this issue.
Regards, Brian
  1 Comment
Torsten on 24 Aug 2017
Viewed as functions, x and x^3 are linearly independent, since there are no constants a and b (not both zero) that make a*x + b*x^3 identically zero over a given interval.
Also, I don't understand what the tool you are asking for would be good for. Do you mean from a performance point of view?
Best wishes
Torsten.


Accepted Answer

John D'Errico on 24 Aug 2017
Edited: John D'Errico on 24 Aug 2017
Um, yes. ANY regression computation takes into account the relation between the variables.
So backslash, regress, lscov, lsqr, fit, etc. all take that into account.
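For instance, here is a minimal sketch of fitting y = a*x + b*x^3 with backslash; the data and the coefficients 2 and -0.5 below are made up purely for illustration, and regress, lscov, or fit would accept the same design matrix or an equivalent model specification.
% Illustrative only: synthetic data with assumed coefficients.
x = randn(100,1);
y = 2*x - 0.5*x.^3 + 0.1*randn(100,1);
A = [x(:), x(:).^3];   % design matrix holding both regressors
coef = A \ y           % least-squares estimates of [a; b]
The correlation between the two columns is handled automatically by the least-squares solve.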
I think your issue is that you don't understand how regression works. For that, there are entire courses that are taught.
Yes, it is true that x and x^3 are correlated with each other. But note that mathematical linear independence is not the same as saying the two variables are unrelated. That is, no linear combination a*x + b*x^3 is identically zero EXCEPT in the trivial case a = b = 0, so x and x^3 each provide different information to the problem. Yet at the same time, x and x^3 are not orthogonal; there is essentially some overlap in what they do.
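As a quick, purely illustrative check of both statements (the sample of x below is arbitrary):
% Arbitrary sample of x, just to illustrate the point.
x = linspace(-2, 2, 200)';
A = [x, x.^3];
rank(A)             % returns 2: the columns are linearly independent
corrcoef(x, x.^3)   % off-diagonal entries well away from zero: the columns overlap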
Yes, it is also true that there may be numerical issues. But that relationship between the variables is factored in when the regression is done. I really cannot say much more without specifics, or without writing a complete text on linear regression myself. Better that you read one, since many have been published. Perhaps a classic like that by Draper and Smith would be a good choice.
  2 Comments
Brian Scannell on 24 Aug 2017
Thanks for answering. Comments and suggestions much appreciated.
John D'Errico on 24 Aug 2017
Edited: John D'Errico on 24 Aug 2017
I think you have gotten confused about regression, probably by a comment from a colleague. I seem to recall you saying that a colleague had said something about x and x^3, and now you seem to be worried.
While you should always beware of problems, odds are the inter-relationship between x and x^3 is not going to be an issue, at least if there are only two variables, and if you see no warning messages.
One test is to compute
cond([x(:),x(:).^3])
If you have other terms in the problem, they need to be included there too. The best possible value here is 1. If you were seeing large numbers, REALLY large, on the order of 1e15 or so, you would start to get quite worried. Even 1e8 would be pretty bad. But, for example, let's try it on some sample data.
x = randn(100,1);
cond([x(:),x(:).^3])
ans =
5.9516
So only 5.9. On the scale of how worried I would get here, 5.9 is laughably small.
Compare that to a different problem, with a much more complex model. Here, one with 16 polynomial terms in it.
x = rand(100,1);
cond([x.^[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]])
ans =
2.9659e+11
That is large enough that the coefficients may have little real value. Any polynomial coefficients you estimate from that model would arguably be almost useless.
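As a rough illustration of why that matters in practice (the choice of y below is made up for this sketch), refit the same 16-term model after a tiny perturbation of the data and compare the coefficients:
x = rand(100,1);
A = x.^(0:15);                       % the same 16 polynomial terms as above
y = exp(x);                          % any smooth test function will do here
c1 = A \ y;
c2 = A \ (y + 1e-10*randn(100,1));   % perturb y by a tiny amount
norm(c1 - c2) / norm(c1)             % the coefficients move far more than y did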
But for the model you have described? There is probably no big issue.


More Answers (0)
