
I have two signals in MATLAB, say

a = randn(1,1e6) + randn(1,1e6)*exp(-1i * 2*pi * 1.1);

b = randn(1,1e6) + randn(1,1e6)*exp(-1i * 2*pi * 1.4);

I am finding the correlation between them as follows:

R=corrcoef(a,b);

r = R(2,1);

Now each time I run my code, the correlation coefficient is different. I even tried to increase the number of samples (from 1e6 to higher values) but that didn't work. Is there some other way to find the correlation coefficient between such signals?

John D'Errico
on 1 Oct 2015

Edited: John D'Errico
on 1 Oct 2015

When you generate random data, you will NEVER recover the EXACT correlation coefficient from a sample. The data are composed of random numbers, so any statistic computed from them is itself random. Perhaps a related example will help.

What is the mean of a uniform random variable, sampled from the interval [0,1]? We know this of course to be 0.5. But try it:

n = 1e6;

x = rand(1,n);

mean(x)

ans =

0.50032

x = rand(1,n);

mean(x)

ans =

0.50006

x = rand(1,n);

mean(x)

ans =

0.4997

Hmm. I got a different number each time, and none of them were the expected value of 0.5. Close, but not exactly so.
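To make it clear that the variation comes from the random draw and not from the computation, here is a minimal sketch that fixes the random number generator seed with rng. The seed value 1 and the names m1 and m2 are arbitrary choices for illustration, not something from the original post.

```matlab
% Sketch: the same seed gives the same draw, and so the same sample mean.
n = 1e6;

rng(1);                    % seed value 1 is an arbitrary, illustrative choice
m1 = mean(rand(1, n));

rng(1);                    % reset the generator to the same seed
m2 = mean(rand(1, n));

disp(isequal(m1, m2))      % identical draws give identical sample means
disp(abs(m1 - 0.5))        % small, but the estimate is still not the expected value
```

Fixing the seed makes the result repeatable, but it does not make the estimate exact. It only selects one particular sample.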

In fact, Statistics 101 teaches us that as the sample size increases, the sample mean approaches the known expected value, but we should never expect an exact result. The sample correlation coefficient computed from data has the same property: it is an estimate of the population parameter, and it will generally not equal the true value that one could compute from theory. The value we get also varies from run to run, due to the randomness of the sample. This is all exactly as we expect.
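As a rough illustration of how the scatter shrinks with sample size, the sketch below repeats the mean experiment many times for several values of n and compares the observed spread with the theoretical standard error sqrt(1/12)/sqrt(n) for a uniform [0,1] variable. The trial count and the variable names (nTrials, sizes, s) are assumptions made for this example.

```matlab
% Sketch: spread of the sample mean of rand() for increasing sample sizes.
nTrials = 100;                 % number of repeated experiments (arbitrary)
sizes   = [1e2 1e4 1e6];       % sample sizes to compare
s       = zeros(size(sizes));  % observed std of the sample mean per size

for i = 1:numel(sizes)
    n = sizes(i);
    m = zeros(1, nTrials);
    for k = 1:nTrials
        m(k) = mean(rand(1, n));   % one sample mean per trial
    end
    s(i) = std(m);
    % A uniform [0,1] variable has variance 1/12, so the standard
    % error of the mean is sqrt(1/(12*n)).
    fprintf('n = %g: observed std = %.2e, theory = %.2e\n', ...
            n, s(i), sqrt(1/(12*n)));
end
```

The spread drops by roughly a factor of 10 for every factor of 100 in n, so it gets small but never reaches zero for any finite sample.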

John D'Errico
on 2 Oct 2015

It looks like you are overlooking the difference between a parameter estimated from a SAMPLE of some random variable and an expected value computed for the entire population.

This is exactly why I gave the example of the mean of a random sample. The EXPECTED value of that mean is 0.5. However, the computed mean of those random samples will essentially never be exactly the expected value. The same thing applies to a correlation coefficient. The computation that you show is no more exactly true than the computation I show for the mean.

So, again, the point is there is a difference between a sample mean and a population mean. Your problem seems to arise from the use of similar names for the parameters.
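The same experiment can be sketched for the correlation coefficient of the two signals in the question. One observation, under the assumption that the four randn calls are independent exactly as posted: a and b then share no common component, so their population correlation is zero and each run returns only sampling noise around zero, with magnitude on the order of 1/sqrt(n). The trial count and the reduced n below are illustrative choices to keep the run short.

```matlab
% Sketch: run-to-run scatter of the sample correlation coefficient.
nTrials = 50;                  % number of repeated experiments (arbitrary)
n       = 1e5;                 % smaller than 1e6 only to keep the run short
r       = zeros(1, nTrials);   % one (complex) coefficient per trial

for k = 1:nTrials
    % Same construction as in the question; all randn draws are independent.
    a = randn(1, n) + randn(1, n)*exp(-1i * 2*pi * 1.1);
    b = randn(1, n) + randn(1, n)*exp(-1i * 2*pi * 1.4);
    R = corrcoef(a, b);
    r(k) = R(2,1);
end

% Compare the typical magnitude of r with the 1/sqrt(n) scale.
fprintf('mean |r| = %.2e, 1/sqrt(n) = %.2e\n', mean(abs(r)), 1/sqrt(n));
```

Every trial gives a different r, yet all of them are consistent with a population value of zero. Increasing n narrows the scatter but does not remove it.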

Jan
on 1 Oct 2015
