Multivariate analysis advice needed

Question

Brian Scannell on 5 Jul 2016

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/293834-multivariate-analysis-advice-needed

Commented: Star Strider on 6 Jul 2016

I have next to no knowledge of statistics, so please bear with me.

I have a large dataset of three physical parameters, A, B and C. Established convention based on multiple empirical studies is that A is a linear function of B. When I plot my data as a scatter plot of A against B and shade each point according to C, the linear relationship between A and B is clear, but it also appears that A is dependent on C (or at least that higher values of A correlate with higher values of C for the same value of B).

What I now need to do is: (a) quantify the empirical relationship of A as a function of both B and C; (b) provide some indication of confidence that C genuinely is influencing A e.g. demonstrate a very low probability that A is fully independent of C; and (c) quantify the improvement in the accuracy of the predicted value of A when described as a function of B and C rather than solely as a function of B.

My knowledge of statistics is woefully lacking for this task. I have heard of multiple linear regression and ANOVA, but have never attempted either and don't really know what they are or how they differ.

I'm hoping that by defining the objectives as clearly as I can, someone will be able to point me in the right direction as to which tools to use and how to apply them.

One final bit of information that may be relevant: my sample sizes for A, B and C run into tens of thousands of measurements, and the observations are (roughly) coincident in space and time, so there's no shortage of data. I may want to sub-sample to explore whether the relationship is affected by a fourth environmental factor, but even then I will have thousands of observations for each sub-sample.

All thoughts / comments / suggestions welcome.

Regards, Brian

Answer 1

Star Strider on 5 Jul 2016

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/293834-multivariate-analysis-advice-needed#answer_227778

I would use the Statistics and Machine Learning Toolbox regress function. It should give you everything you need.

I would ask for at least the first two outputs:

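[b,bint] = regress(y,X);   % y = A; X = [ones(size(B)) B C], since regress only fits an intercept if X contains a column of ones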

The ‘bint’ matrix gives the confidence intervals (95% by default) for each parameter. This tells you whether both parameters are needed in the regression: if a confidence interval includes zero (a shortcut is that the confidence limits are of opposite signs), that parameter is not significantly different from zero at the 5% level, and is probably not needed in the regression. Otherwise, the estimated parameter is significantly different from zero, and should be kept in the regression.
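As a rough illustration of the confidence-interval check, here is a minimal sketch on synthetic data. The coefficients, noise level and sample size are made-up stand-ins for the real A, B and C, and the variable names are assumptions. It also uses the fifth output of regress to compare the error variance of a B-only fit against a fit on B and C, which is one possible way to approach part (c) of the question.

```matlab
% Synthetic stand-ins for the measured A, B and C (hypothetical values).
rng(0);                              % reproducible random numbers
n = 5000;
B = 10*rand(n,1);
C = 5*rand(n,1);
A = 2 + 1.5*B + 0.4*C + randn(n,1);

% regress does not add an intercept, so include a column of ones.
X = [ones(n,1) B C];
[b,bint] = regress(A,X);             % b = [intercept; slope on B; slope on C]

% A coefficient differs from zero at the 5% level when its 95% interval
% excludes zero, i.e. both limits have the same sign.
isSignificant = prod(sign(bint),2) > 0;
disp(table(b, bint, isSignificant, 'RowNames', {'Intercept','B','C'}));

% Compare with the conventional B-only model using the fifth output,
% stats = [R^2, F statistic, p-value, error variance estimate].
[~,~,~,~,statsB]  = regress(A,[ones(n,1) B]);
[~,~,~,~,statsBC] = regress(A,X);
fprintf('R^2,  B only: %.4f; B and C: %.4f\n', statsB(1), statsBC(1));
fprintf('RMSE, B only: %.4f; B and C: %.4f\n', sqrt(statsB(4)), sqrt(statsBC(4)));
```

With tens of thousands of observations, even a very weak effect of C will give an interval that excludes zero, so the change in RMSE or R^2 is probably the more informative number for judging whether C matters in practice.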

Alternatively (or additionally), if you want to test which of the parameters are needed in the linear regression, use the stepwisefit function.
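A minimal sketch of the stepwisefit approach, again on made-up data. The extra predictor D is hypothetical and deliberately unrelated to A, to show a term that the procedure would normally leave out. Note that the argument order is the reverse of regress, and that stepwisefit adds the constant term itself.

```matlab
% Synthetic stand-ins for the measured data (hypothetical values).
rng(1);                              % reproducible random numbers
n = 5000;
B = 10*rand(n,1);
C = 5*rand(n,1);
D = randn(n,1);                      % hypothetical predictor unrelated to A
A = 2 + 1.5*B + 0.4*C + randn(n,1);

% Predictors come first, and no column of ones is needed (unlike regress).
X = [B C D];
[b,se,pval,inmodel,stats] = stepwisefit(X,A,'Display','off');

% inmodel is a logical vector flagging the columns of X that were kept.
names = {'B','C','D'};
fprintf('Terms kept: %s\n', strjoin(names(inmodel), ', '));
fprintf('Intercept:  %.3f\n', stats.intercept);
```

The entry and removal thresholds default to p-values of 0.05 and 0.10, so an unrelated predictor such as D can still be admitted occasionally by chance.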

I would review the documentation for both, and see which one best fits your needs.

2 Comments

Brian Scannell on 6 Jul 2016

Thanks Star Strider - much appreciated.

Star Strider on 6 Jul 2016

My pleasure!
