How to fit a general curve on different datasets?

Hi,
I have several datasets that I would like to fit together in order to obtain one general equation.
So far I have fitted each dataset independently as: y_i = a_i + b_i * log10(x)
%% Example of one dataset
x= [1 10 100]';
y= [165 145 124]';
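fo = fitoptions('Method','NonlinearLeastSquares',...
'Lower',[1 -Inf],...   % bounds need one entry per coefficient, in the order [a b]
'Upper',[200 Inf],...  % assumption: the original bounds were meant for a only, so b is left unbounded
'StartPoint',[180 180]);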
ft = fittype('a + b*log10(x)','dependent',{'y'},'independent',{'x'},'coefficients',{'a','b'},'options',fo);
myfit=fit(x(:),y(:),ft);
plot(x,y, 'ko', 'LineWidth',1.5, 'MarkerSize',7)
hold on
plot(myfit, 'k--')
Now I would like to test a different model: y_i = a_i + b * log10(x), with a single b shared by all the datasets. So I need to fit this new model to my whole collection of datasets to get an estimate of b that matches every dataset.
Does anyone have an idea of how to do this in MATLAB?
Thank you very much for your help.

Accepted Answer

Jeff Miller on 13 Sep 2019
You need to set up separate independent variables for each dataset, something like this:
%% Example of two datasets (3 scores each)
i1= [1 1 1 0 0 0]'; % 1 indicates member of 1st data set
i2= [0 0 0 1 1 1]'; % 1 indicates member of 2nd data set
x= [1 10 100 1 10 100]'; % concatenated x's for both data sets
y= [165 145 124 155 135 114]'; % concatenated y's for both data sets
% Then fit a model like this
'a1*i1 + a2*i2 + b*log10(x)'
But I don't think the 'fit' function will do that, at least not for more than 2 datasets. You could do it with regress instead, provided you make a new independent variable:
log10x = log10(x);
% Then fit
'a1*i1 + a2*i2 + b*log10x'
MATLAB probably has some other nonlinear fitting routines that would also work, using the same basic indicator variable trick.
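As a rough sketch of how the indicator-variable idea might look when written out in full for any number of datasets: the names g, K, D, X and coef below are illustrative and not from the original post, and the indicator matrix is built with implicit expansion, which assumes R2016b or later. regress is the function suggested above; it needs the Statistics and Machine Learning Toolbox.

```matlab
% Sketch only: g, K, D, X and coef are illustrative names.
x = [1 10 100 1 10 100]';          % concatenated x values for both data sets
y = [165 145 124 155 135 114]';    % concatenated y values for both data sets
g = [1 1 1 2 2 2]';                % data set index of each observation

K = max(g);                        % number of data sets
D = double(g == 1:K);              % N-by-K indicator matrix (assumes R2016b or later)
X = [D log10(x)];                  % one intercept column per data set, one shared slope column

coef = regress(y, X);              % least-squares fit; X \ y gives the same coefficients
a = coef(1:K);                     % a(i) is the intercept of data set i
b = coef(end);                     % slope shared by all data sets
fprintf('a = %s, b = %.4f\n', mat2str(a', 6), b);
```

With the two example datasets this gives b = -20.5 and intercepts of about 165.17 and 155.17. Adding a third dataset only means appending its x, y and g values; the model matrix grows by one indicator column.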
  6 Comments
Jeff Miller on 20 Sep 2019
A small simulation shows that John is exactly right:
% Simulation with equal x's in all groups, but random y's
i1= [1 1 1 0 0 0]'; % 1 indicates member of 1st data set
i2= [0 0 0 1 1 1]'; % 1 indicates member of 2nd data set
x = [1 10 100 1 10 100]'; % concatenated x's for both data sets
log10x = log10(x);
NTries = 10;
for i=1:NTries
y1 = randn(3,1);
a1 = regress(y1,[i1(1:3) log10x(1:3)]);
y2 = randn(3,1);
a2 = regress(y2,[i2(4:6) log10x(4:6)]);
y = [y1; y2];
a = regress(y,[i1 i2 log10x]);
dif = abs(a(3) - (a1(2) + a2(2))/2)
end
% The resulting difs are all minuscule, so in this case the slope of the
% combined data set matches the average of the individual slopes.
% Simulation with different x's in all groups, but random y's
for i=1:NTries
x = randn(6,1).^2;
log10x = log10(x);
y1 = randn(3,1);
a1 = regress(y1,[i1(1:3) log10x(1:3)]);
y2 = randn(3,1);
a2 = regress(y2,[i2(4:6) log10x(4:6)]);
y = [y1; y2];
a = regress(y,[i1 i2 log10x]);
dif2 = abs(a(3) - (a1(2) + a2(2))/2)
end
% The resulting dif2s are much larger, so in this case the slope of the
% combined data set need not match the average of the individual slopes.
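One way to see why the two simulations behave differently, sketched with notation that does not appear in the thread: write u_ij = log10(x_ij) for observation j of dataset i, and let \bar{u}_i and \bar{y}_i be the per-dataset means. For the model y_ij = a_i + b u_ij, the least-squares estimate of the shared slope is a weighted average of the per-dataset slopes \hat{b}_i, with weights S_i equal to each dataset's spread in log10(x).

```latex
\hat{b}
  = \frac{\sum_i \sum_j (u_{ij}-\bar{u}_i)(y_{ij}-\bar{y}_i)}
         {\sum_i \sum_j (u_{ij}-\bar{u}_i)^2}
  = \frac{\sum_i S_i \, \hat{b}_i}{\sum_i S_i},
\qquad
S_i = \sum_j (u_{ij}-\bar{u}_i)^2,
\qquad
\hat{b}_i = \frac{\sum_j (u_{ij}-\bar{u}_i)(y_{ij}-\bar{y}_i)}{S_i}
```

When every dataset has the same x values, all S_i are equal and the weighted average reduces to the plain average of the individual slopes. With different x values per dataset the weights differ, so the two generally do not coincide.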
yj on 20 Sep 2019
Yes, thank you very much for your answers!


More Answers (1)

Jon on 11 Sep 2019
If I am understanding what you are trying to do correctly, it seems like you should be able to concatenate all of your individual x values into one overall x vector, and similarly concatenate all of your y values into one overall y vector and then perform the curve fit on the combined set.
  9 Comments
John D'Errico on 20 Sep 2019
Actually, this is the incorrect answer, since it will also use a common value for a.
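To illustrate the point, here is a minimal sketch that assumes the combined fit uses the single model a + b*log10(x) on the concatenated data. The variable names Xpooled and cPooled are illustrative, and the data are the two example datasets from the accepted answer.

```matlab
% Sketch only: pooled fit of one curve to the concatenated data.
x = [1 10 100 1 10 100]';            % concatenated x values for both data sets
y = [165 145 124 155 135 114]';      % concatenated y values for both data sets

Xpooled = [ones(size(x)) log10(x)];  % a single intercept column plus the slope column
cPooled = regress(y, Xpooled);       % cPooled(1) is the common a, cPooled(2) is b

fprintf('common a = %.4f, b = %.4f\n', cPooled(1), cPooled(2));
```

The pooled fit returns only one intercept (about 160.17 here), which lies between the two datasets rather than matching either, so a model with separate a_i needs one indicator column per dataset.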
Jon on 23 Sep 2019
Edited: Jon on 23 Sep 2019
I don't think that the approach I outlined will result in a common value of a, but I'm glad that you found a solution. (The approach I outlined may be buried under the older comments, so you may not have seen it.)
