How to add two different linear fits to the same graph
26 views (last 30 days)
Show older comments
I have tried looking for built-in plotting options to create a bi-linear fit for data but it appears this is something I probably have to code myself. Below I attached an example of the data I am trying to fit:
I want to be able to fit a linear regression to the 'flat' and 'steep' regions of this graph without having to manually create two data sets, regressions, and a combined plot. The point of this is to find the intersection point of the two linear regressions.
Perhaps the best way of solving the problem is to find where concavity/double derivative is maximized, however I really do not know how to go about coding this. Any ideas or help would be greatly appreciated.
4 Comments
dpb
on 27 Aug 2018
How about attach a .mat file with a couple of your curves to illustrate with? I did find my piecewise m-file (it was in an old installation folder).
I don't know how well the linear section is going to work with the RH section of your data; it looks more like a quadratic might be in order.
Accepted Answer
dpb
on 27 Aug 2018
Edited: dpb
on 29 Aug 2018
function y=piecewise(coef,x)
% return piecewise linear fit given [b1,m1,c,m2] as coefficient array and vector x
% c is breakpoint, b1 is intercept of first section, m1,m2 are two segment slopes
b1=coef(1); m1=coef(2); % reduce parentheses clutter....
c=coef(3); m2=coef(4);
ix=x>=c; % location > breakpoint
y(ix)=[b1+c*(m1-m2)+m2*x(ix)];
y(~ix)=[b1+m1*x(~ix)];
y=reshape(y,size(x));
end
Example use; contrived example but works well in general with real data in my experience...
x1=1:5; x2=x1+5; x=[x1 x2];
y=[polyval([1 1],x1) polyval([3 -10],x2)]+rand(size(x));
plot(x,y,'*')
hold on
b=nlinfit(x,y,@piecewise,[1 1 4 3])
b =
1.1746 1.0519 5.5482 3.1284
xh=linspace(x(1),x(end));
yh=piecewise(b,xh);
plot(xh,yh)
ADDENDUM
[dpb wrote earlier comment -- "I've never sat down and done the algebra to write as anything but two linear segments but it would be possible to do the same thing but join a straight line and a quadratic..."]
Well, it dawned on me it isn't difficult to do if one limits the implementation to the one specific case of the quadratic portion being arbitrarily set to one section or the other. Since your data have the nonlinear portion to the right of the breakpoint, I chose to do that case--
function y=piecelinquad(coef,x)
% return piecewise mixed linear/quadratic fit given [b1,m1,c,m2,m3] as
% coefficient array and vector x over which to evaluate function
% c is breakpoint, m1, b1 are slope, intercept of first section, and
% m1,m2 are second segment quadratic, linear coefficients
% The quadratic section as coded is always for x>c, linear for x<=c
%
% dp bozarth 26Aug2018
b1=coef(1); m1=coef(2); % reduce parentheses clutter....
c=coef(3); p=coef(4:5);
ix=x>c; % location > breakpoint
y(ix)=polyval([p b1+c*m1],x(ix)-c);
y(~ix)=b1+m1*x(~ix);
y=reshape(y,size(x));
end
Testing on a couple of artificial datasets such as the one used above, it seems to work well in locating the breakpoint and minimizing overall SSE; I noticed there may be a tendency to place the breakpoint somewhat farther inside the linear section as the overall error can be less with a little overshoot at the breakpoint by the quadratic. All these tests were with a small number of points, however, the more extensive datasets you have could be better behaved.
It raises the question, however, of whether, depending on the actual problem constraints of just using smoothing spline or similar?
4 Comments
dpb
on 28 Aug 2018
I've never sat down and done the algebra to write as anything but two linear segments but it would be possible to do the same thing but join a straight line and a quadratic; what changes is then the second-segment intercept is dependent on that functional as well as function of the intersection.
Roughly locating the breakpoint would probably be very useful in determining an estimate for the breakpoint location starting value; typically ime it is pretty robust although if very far off one can get some warnings about ill-conditioned or rank deficiencies; generally these can be ignored as the routine ends up finding a valid solution and can be checked as above by plotting results and by re-running with the estimates as input guesses almost always the warnings will go away. Only if the data are not at all suited to the model and/or the initial guess is just totally out in left field will it fail with any datasets I've ever tried.
More Answers (0)
See Also
Categories
Find more on Interpolation in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!