Rescaling curves by a factor to a single curve

Hello,
I have a set of curves that I want to rescale into a single master curve. How can I obtain the scaling coefficient?
I have attached my plot and the result I would like to obtain. Thank you.
  2 Comments
John D'Errico on 7 Jul 2021
A simple proportional re-scaling would rescale x or y linearly, so of the general form
x_hat = a*x
Thus, you could take any curve, and convert it to a master, if you only knew a. At least that is what a rescaling would imply.
This is not a problem that polyfit can handle though.
At the same time, it looks like you are using a fit to a piecewise linear function, as well as an exponential. In both cases, these are not simple linear rescalings of the form you have requested. My guess is, that was all you knew how to do, or what you had a tool to accomplish, and is not really what you want. Is that true?
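As a rough sketch of what a pure rescaling of the form x_hat = a*x could look like in practice: the master curve, the second curve, and the factor 0.5 below are all made-up stand-ins (not the poster's data), and the one-parameter search with fminsearch and interp1 is just one possible way to estimate a.

```matlab
% Hypothetical reference ("master") curve, sampled finely. Not the poster's data.
master = @(x) 1 - exp(-x);
xRef = linspace(0, 10, 201);
yRef = master(xRef);

% A second curve that is the master stretched along x by a known factor.
aTrue = 0.5;
xB = linspace(0, 10, 25);
yB = master(aTrue * xB);

% Mismatch between curve B and the master evaluated at the rescaled abscissa a*xB.
mismatch = @(a) norm(interp1(xRef, yRef, a*xB, 'linear', 'extrap') - yB);

% One-parameter search, starting from "no rescaling" (a = 1).
aHat = fminsearch(mismatch, 1);

% x_hat = a*x maps curve B onto the master curve.
xHat = aHat * xB;
```

If the curves really differ only by such a factor, aHat should come out close to the factor used to build the fake data, and plot(xHat, yB, 'o') would overlay the master curve.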
Kabir Shariff on 7 Jul 2021
I want to accomplish two steps:
  1. rescale my data into a single master curve (say, using data A as the reference) with a constant, as shown in Fig. 2; and
  2. use a curve-fitting tool (piecewise, exponential, power law, ...) to obtain the best equation to represent the master curve.
How do I obtain a master curve for my data sets (A, B, C, D, E)?
NB: the second figure is a reference case, not my data. I want to obtain something similar.
Thank you.


Accepted Answer

Kabir Shariff on 11 Jul 2021
Edited: Kabir Shariff on 11 Jul 2021
Hello, thank you for the assistance. I finally found a way to rescale the data and then use a fit function, as suggested earlier.
I use fminsearch with a norm as the objective function to find the coefficient between two data sets.
I chose rB as my reference data and found the coefficient using the code
scale_coef = fminsearch(@(c) norm(rA./c-rB,2),1)
The code returns the coefficient value
scale_coef =
0.7119
The coefficient for each data set is determined with respect to rB and then applied to all of that set's data points to rescale it.
The rescaled data are then fitted with the model equation:
r = a*(1-exp(-b*xA)) + c*xA + d;
General model:
f(x) = a*(1-exp(-b*x)) + c*x + d
Coefficients (with 95% confidence bounds):
a = 5.394 (5.181, 5.608)
b = 0.3305 (0.2965, 0.3646)
c = 0.1197 (0.1116, 0.1278)
d = 0.0337 (-0.04226, 0.1097)
Goodness of fit:
SSE: 22.21
R-square: 0.9921
Adjusted R-square: 0.992
RMSE: 0.3324
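The rescaling step described in this answer can be sketched for several data sets at once. This is only an illustration under assumptions: the data below are synthetic stand-ins built from a made-up common shape, the variable names (xRef, rRef, xs, rs, scaleCoef) are hypothetical, and interp1 is used because the sets are assumed not to share the same x values, which the one-line objective above would require.

```matlab
% Synthetic stand-ins for the real data (assumed, not from this thread).
base = @(x) 5*(1 - exp(-0.3*x)) + 0.1*x;      % hypothetical common shape
xRef = linspace(0, 20, 201)';                 % reference set (plays the role of rB)
rRef = base(xRef);

% Two other sets on different x grids, each a scaled copy of the base shape.
trueScale = [0.7, 1.4];
xs = {linspace(0, 20, 25)', linspace(0, 18, 33)'};
rs = {trueScale(1)*base(xs{1}), trueScale(2)*base(xs{2})};

scaleCoef = zeros(1, numel(xs));
rScaled   = cell(size(rs));
for k = 1:numel(xs)
    % Compare only where this set overlaps the reference in x.
    inRange = xs{k} >= min(xRef) & xs{k} <= max(xRef);
    refAtX  = interp1(xRef, rRef, xs{k}(inRange));

    % Same objective as in the answer: 2-norm of the mismatch after dividing by c.
    obj = @(c) norm(rs{k}(inRange)./c - refAtX, 2);
    scaleCoef(k) = fminsearch(obj, 1);

    % Rescale every point of this set by its coefficient.
    rScaled{k} = rs{k} ./ scaleCoef(k);
end
```

After the loop, plotting each (xs{k}, rScaled{k}) together with (xRef, rRef) shows whether the sets collapse onto one curve; the combined points could then be passed to fit with the chosen model.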

More Answers (2)

KSSV on 7 Jul 2021
Read about polyfit.

John D'Errico on 7 Jul 2021
Edited: John D'Errico on 7 Jul 2021
You are looking to fit these curves using some nonlinear model. But you have not even said what model you want to use, just that it will allow you to fit all curves in your family of data to the chosen model.
It looks like each curve might pass through 1 at x == 0. I suppose if you consider some base model, perhaps:
r = 1 + log(x + 1)
then we can view each curve as a re-scaled version, perhaps of the general family:
r(x,a,b) = 1 + a*log(b*x + 1)
Each curve still has the same property, that at x==0, it will have r(0,a,b) = 1. In that curve family, we can assume any log base you want. If you feel more comfortable using log10, or a natural log, even log2, that is fine. Whichever seems most appropriate is fine.
You don't provide any data, so it is difficult to be sure if that model would be appropriate, but it should have the general shape you showed. Now what you need to do is use a tool like the Curve Fitting toolbox. You could also use nlinfit, if you have the stats TB, or lsqcurvefit, if you have the optimization TB. IMHO, the CFTB is slightly better for this sort of problem, because its interface is designed to solve it very naturally. The CFTB also directly gives you parameter uncertainties, and people seem to like to see them. The other TBs are not even remotely difficult to use though.
For each curve, you will perform the fit, using a fittype of the form:
ft = fittype('1 + a*log(b*x + 1)','indep','x');
Then you would use the function fit to estimate a and b from each curve. So each curve would have values of a and b, specific to that curve. Again, since I lack your data, I can't directly show how that might work. So here is some fudged, fake data.
x = [1.4631 1.9048 4.1775 8.2032 9.4854 12.221 13.587 13.701 14.363 14.473];
r = [ 1.9164 2.1207 2.9809 3.9242 4.0916 4.5444 4.7152 4.7689 4.8723 4.8862];
plot(x,r,'mo')
We would fit the data with fit...
mdl = fit(x',r',ft,'start',[1 1],'lower',[.001 .001])
mdl =
General model:
mdl(x) = 1 + a*log(b*x + 1)
Coefficients (with 95% confidence bounds):
a = 2.059 (1.947, 2.171)
b = 0.3803 (0.338, 0.4225)
That would employ a natural log. The result will be parameters a and b, such that the curve can now be "calibrated" to fit the master model. For this data set, a was approximately 2.059, and b was approximately 0.3803. If you now wish to re-scale the data so that it overlays onto the master curve, you would then do it as:
% first, plot the master model in blue:
fplot(@(x) 1 + log(x + 1),[0 5])
hold on
% re-scaled data:
rhat = 1 + (r - 1)/mdl.a;
xhat = x*mdl.b;
% in red, the rescaled data.
plot(xhat,rhat,'ro')
As you can see, the data now lies on top of the master curve. In this case, since the noise was pretty low, the curve fit was quite good.
I could have used other tools to do the fit as well, but the CFTB is a good choice for this problem. It may also be that my choice of model is not the best one for your data, but since I don't have your data, that is purely a wild guess on my part.
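For comparison, here is a minimal sketch of the same fit done with lsqcurvefit (Optimization Toolbox) instead of the CFTB, using the same fake data as above. The parameter vector p = [a b] and the handle name model are my own naming. Unlike fit, this route does not report confidence bounds directly.

```matlab
% Same fudged data as in the fit example.
x = [1.4631 1.9048 4.1775 8.2032 9.4854 12.221 13.587 13.701 14.363 14.473];
r = [1.9164 2.1207 2.9809 3.9242 4.0916 4.5444 4.7152 4.7689 4.8723 4.8862];

% Model r = 1 + a*log(b*x + 1), with p(1) = a and p(2) = b.
model = @(p, x) 1 + p(1)*log(p(2)*x + 1);

p0 = [1 1];            % same start point as in the fit call
lb = [0.001 0.001];    % same lower bounds, keeps b*x + 1 positive
ub = [];               % no upper bounds

p = lsqcurvefit(model, p0, x, r, lb, ub);

% Re-scale the data onto the master curve 1 + log(x + 1), as before.
rhat = 1 + (r - 1)/p(1);
xhat = x*p(2);
```

Since both tools minimize the same sum of squared residuals, p should land close to the a and b that fit reported.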
  3 Comments
John D'Errico on 7 Jul 2021
Edited: John D'Errico on 7 Jul 2021
Looking at your data, this was my initial guess at a model:
ft = fittype('1 + a*log(b*x + 1)','indep','x');
mdlA = fit(xA',rA,ft,'start',[1 1])
mdlA =
General model:
mdlA(x) = 1 + a*log(b*x + 1)
Coefficients (with 95% confidence bounds):
a = 1.608 (1.476, 1.74)
b = 0.9141 (0.6778, 1.15)
plot(mdlA)
hold on
plot(xA,rA,'bo')
title 'A curve'
So not a very good fit there. Shall I compare it to one of OJ Simpson's gloves?
mdlB = fit(xB',rB,ft,'start',[1 1])
mdlB =
General model:
ans(x) = 1 + a*log(b*x + 1)
Coefficients (with 95% confidence bounds):
a = 2.701 (2.446, 2.956)
b = 0.6362 (0.4636, 0.8087)
plot(mdlB)
hold on
plot(xB,rB,'bo')
So no better on the B curve.
Indeed, if I look at the second derivative for your curve data (as generated from a spline fit) it appears to be virtually constant at zero above x == 5 or 10.
That in turn suggests a model that is asymptotic to a straight line. An arc of a hyperbola is one such curve. You might consider a model like:
r = a*(1 - exp(-b*x)) + c*x + 1
which also forces the curve to pass through (x,r) = (0,1).
ft = fittype('a*(1 - exp(-b*x)) + c*x + 1','indep','x');
mdlB = fit(xB',rB,ft,'start',[1 1 1])
mdlB =
General model:
mdlB(x) = a*(1 - exp(-b*x)) + c*x + 1
Coefficients (with 95% confidence bounds):
a = 3.837 (3.731, 3.943)
b = 0.5731 (0.5054, 0.6409)
c = 0.1409 (0.1368, 0.1449)
plot(mdlB)
hold on
plot(xB,rB,'bo')
That works considerably better, but still shows significant lack of fit in my opinion.
mdlC = fit(xC',rC,ft,'start',[1 1 1])
mdlC =
General model:
mdlC(x) = a*(1 - exp(-b*x)) + c*x + 1
Coefficients (with 95% confidence bounds):
a = 4.034 (3.762, 4.306)
b = 0.5098 (0.4266, 0.593)
c = 0.3965 (0.3777, 0.4153)
That C data fits this model even a bit better...
Some of the problem may arise from my insistence the curve pass through that point, which seemed reasonable initially when I looked at your plots. That was just a guess.
ft2 = fittype('a*(1 - exp(-b*x)) + c*x + d','indep','x');
mdlB = fit(xB',rB,ft2,'start',[1 1 1 1])
mdlB =
General model:
mdlB(x) = a*(1 - exp(-b*x)) + c*x + d
Coefficients (with 95% confidence bounds):
a = 2.479 (2.375, 2.582)
b = 0.263 (0.2408, 0.2852)
c = 0.1324 (0.1305, 0.1344)
d = 2.621 (2.507, 2.736)
plot(mdlB)
hold on
plot(xB,rB,'bo')
This was remarkably better.
At x == 0, the curve passes through r == d. So for the B curve, that would be 2.621. You can probably come up with some justification for that final model, now that we see it fits so well. Think of the response as two superposed systems, one of which dies off rather quickly, while the other is purely linear.
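The second-derivative check mentioned earlier in this comment can be sketched roughly as follows. The data here are a noise-free, made-up curve with the same general shape (not the poster's data), and the use of spline with fnder from the Curve Fitting Toolbox is an assumption about how such a check might be done, not a record of the exact steps used.

```matlab
% Hypothetical noise-free stand-in data with the shape discussed in this comment.
x = linspace(0, 30, 61)';
r = 2.5*(1 - exp(-0.26*x)) + 0.13*x + 2.6;

pp   = spline(x, r);      % cubic interpolating spline in pp-form
d2pp = fnder(pp, 2);      % its second derivative (Curve Fitting Toolbox)
xq   = linspace(0, 30, 301)';
d2   = ppval(d2pp, xq);   % second derivative evaluated on a fine grid

% Largest curvature magnitude in the tail versus near the origin.
tailMax  = max(abs(d2(xq >= 10)));
startMax = max(abs(d2(xq <= 2)));

plot(xq, d2)
xlabel('x'), ylabel('d^2r/dx^2')
```

A tail value that is small compared with the value near the origin is what points toward a model that becomes a straight line for large x. With noisy measurements an interpolating spline will amplify the noise in the second derivative, so a smoothing spline would probably be needed instead.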
Kabir Shariff on 8 Jul 2021
Hello,
Thank you very much for the assistance. I have tried different model expressions in the cftool app, but the 4-term custom exponential expression seems to match all the data sets better.
r = 'a*(1-exp(-b*xA)) + c*xA + d'; % model equation
ft = fittype(r,'independent','xA') % fit type built from the model equation
mdlA = fit(xA,rA,ft,'start',[1 1 1 1]) % fits model ft to data (xA, rA), starting from the initial guess [1 1 1 1]
mdlB = fit(xB,rB,ft,'start',[1 1 1 1])
mdlC = fit(xC,rC,ft,'start',[1 1 1 1])
mdlD = fit(xD,rD,ft,'start',[1 1 1 1])
mdlE = fit(xE,rE,ft,'start',[1 1 1 1])
For each data set, the constants a, b, c and d are optimized to fit the sample data (fit A, fit B, ...).
My question now is: how can I rescale the data points so that they superimpose onto a single master curve?
For example, using data A as the reference curve and then dividing or multiplying data B, C, D and E by a constant (or a function), as shown in the figure above.
Since all data sets have the same shape (although not the same number of points), I expect all the data to follow the same pattern.
Finally, I can then apply the cfit model to the master curve.
Thank you

