How to perform linear curve fitting based on distance from line instead of residuals

Question

Damon Aboud on 5 Jul 2022

0
Link

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals

Answered: Image Analyst on 18 Jul 2022

I already know how to perform a classic 1st order curve fir using the polyval function. However, I find that the polyval curve fit does accurately follow my current data because it has a high slope and a round trend. Based on the sum of square errors approach used by polyval, it seems to be over-weighing certain points. Is there a way to do a 1st order curve fit that is guided by the total distance of the points from the line, instead of based on residuals?

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

John D'Errico on 5 Jul 2022

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals#answer_1000990

Edited: John D'Errico on 5 Jul 2022

Open in MATLAB Online

This is the classic total least square problem. Sometimes it is called the errors in variables problem, sometimes known as orthogonal regression.

Lacking your data, I can't show how to do it using your data. But it is not difficult to do. Assume you have data in the form of vectors x and y. I'll make some up, here:

t = rand(100,1);

x = t/10 + randn(size(t))/10;

y = 5*t + randn(size(t))/10;

plot(x,y,'o')

axis equal

The actual slope of the curve should have been 50. What happens when we use polyfit?

P1 = polyfit(x,y,1)
P1 = 1×2
    2.9081    2.3460

The problem is, polyfit always produces biased estimates in these problems, where the slope will be biased towards zero. Do you see the polyfit extimated slope is very small? Instead of 50, we got a number around 2.9081 from polyfit.

The trick is to use total least squares. We can get the coefficients of the line from:

mux = mean(x);
muy = mean(y);
[~,~,V] = svd([x(:)-mux,y(:)-muy],0);
coef = V(:,end)
coef = 2×1
   -0.9998
    0.0176
AEst = -coef(1)/coef(2)
AEst = 56.7846

So despite the rather high presence of noise on our data, we get a slope estimate 56.7846, instead of the 2.9 value that polyfit predicted. That seems pretty reasonable.

The constant term in a regression model will be:

BEst = muy - AEst*mux
BEst = 0.0527

The line of best orthogonal fit is:

y = 0.0527 + 56.7846*x

1 Comment
Show -1 older commentsHide -1 older comments

Damon Aboud on 18 Jul 2022

Wow, thank you very much for your thorough help on this task!! This is super helpful, and it totally solves my problem.

Sign in to comment.

Answer 2

Sulaymon Eshkabilov on 5 Jul 2022

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals#answer_1000945

You can try a scaling factor here. Your data seems to be within [355 365]; thus, take out 355 from x and then perform the linear fit.

You can also try fitlm() that is very similar to polyfit().

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Answer 3

Torsten on 5 Jul 2022

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals#answer_1000955

Edited: Torsten on 5 Jul 2022

The residuals are the sum of the vertical distances of the points from the line.

If you want to consider the orthogonal distances of the points from the line, you can proceed as follows:

Calculate the distance d_i of a point P = (x_i,y_i) from a line y = a*x+b. If you don't know how to do this: google is your friend.

Once you have this distance, you can formulate the optimization problem as

min_{a,b} sum_i d_i

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Answer 4

Torsten on 6 Jul 2022

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals#answer_1001375

Edited: Torsten on 6 Jul 2022

Open in MATLAB Online

rng("default")

t = rand(100,1);

x = t/10 + randn(size(t))/10;

y = 5*t + randn(size(t))/10;

plot(x,y,'o')

sol = fminsearch(@(p)fun(p,x,y),[1 1])

sol = 1×2

54.4774 0.2832

fun(sol,x,y)

ans = 1.0309

hold on

plot(x,sol(1)*x+sol(2))

function res = fun(p,x,y)

a = p(1);

b = p(2);

for i = 1:numel(x)

xproj = (x(i)+a*(y(i)-b))/(1+a^2);

yproj = a*xproj+b;

dsquared(i) = (xproj-x(i))^2+(yproj-y(i))^2;

end

res = sum(dsquared);

end

1 Comment
Show -1 older commentsHide -1 older comments

Torsten on 6 Jul 2022

Edited: Torsten on 6 Jul 2022

Open in MATLAB Online

rng("default")
t = rand(100,1);
x = t/10 + randn(size(t))/10;
y = 5*t + randn(size(t))/10;
mux = mean(x);
muy = mean(y);
[~,~,V] = svd([x(:)-mux,y(:)-muy],0);
coef = V(:,end)
coef = 2×1
   -0.9998
    0.0184
AEst = -coef(1)/coef(2)
AEst = 54.4774
BEst = muy - AEst*mux
BEst = 0.2832

Sign in to comment.

Answer 5

Matt J on 18 Jul 2022

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals#answer_1009955

You can use linear2dFit() from this FEX package,

https://www.mathworks.com/matlabcentral/fileexchange/87584-object-oriented-tools-for-fitting-conics-and-quadrics

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Answer 6

Image Analyst on 18 Jul 2022

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/1754180-how-to-perform-linear-curve-fitting-based-on-distance-from-line-instead-of-residuals#answer_1010105

In cases like this, with mostly vertical data, I just swap x and y and it seems to provide a good fit. Of course the residuals are horizontal rather than vertical but the fitted line seems reasonable. Of course you would only do this in certain situations where it physically makes sense, like fitting a line to the side of some blob in a binary image. You wouldn't do it for something like time series measurements. Of course you "orthogonal" residuals also would not have any theoretical reason for being right in that case either.

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

How to perform linear curve fitting based on distance from line instead of residuals

0 Comments
Show -2 older commentsHide -2 older comments

Accepted Answer

1 Comment
Show -1 older commentsHide -1 older comments

More Answers (5)

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

1 Comment
Show -1 older commentsHide -1 older comments

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Products

Release

Community Treasure Hunt

How to perform linear curve fitting based on distance from line instead of residuals

0 Comments Show -2 older commentsHide -2 older comments

Accepted Answer

1 Comment Show -1 older commentsHide -1 older comments

More Answers (5)

0 Comments Show -2 older commentsHide -2 older comments

0 Comments Show -2 older commentsHide -2 older comments

1 Comment Show -1 older commentsHide -1 older comments

0 Comments Show -2 older commentsHide -2 older comments

0 Comments Show -2 older commentsHide -2 older comments

See Also

Categories

Tags

Products

Release

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

1 Comment
Show -1 older commentsHide -1 older comments

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments

1 Comment
Show -1 older commentsHide -1 older comments

0 Comments
Show -2 older commentsHide -2 older comments

0 Comments
Show -2 older commentsHide -2 older comments