MATLAB polyfit problem: strange coefficients when using mu

peter huang on 17 Mar 2023
Commented: John D'Errico on 19 Mar 2023
I am testing the accuracy of polyfit for calculating simple trends. I found that when I use the mu output, the coefficients of the equation look abnormal, although there is no problem when plotting the fitted points. I want to know why the equation coefficients come out this way when using the mu output.
The code is:
clear all; clc; clf
x = linspace(0, 10, 50);
y = 2.*x.^3 + 5.*x.^2 + 1.*x + 3;
%%
p1 = polyfit(x, y, 3);
[p2, s2, mu2] = polyfit(x, y, 3)
hold on
y1 = polyval(p1, x);
y2 = polyval(p2, x, s2, mu2);
%%
plot(x, y, '-'); hold on
plot(x, y1, '.')
plot(x, y2, 'o')
legend('raw data', 'fit1', 'fit2(mu)')
p1 =
2 5.00000000000007 0.999999999999666 3.00000000000043
p2 =
52.6599154117175 309.766763848397 597.970066767299 383
Also, I would like to ask whether it would affect the calculation if I used x1 = 1:1:length(y) instead of x as the input to polyfit.

Answers (2)

the cyclist on 17 Mar 2023
It is admittedly confusing, and I have fallen into this trap myself. The reason for the difference is described in the documentation, though:
[p,S,mu] = polyfit(x,y,n) performs *centering and scaling* to improve the numerical properties of both the polynomial and the fitting algorithm.
(I added the emphasis.)
If you perform the same scaling prior to the first call to polyfit, you will get the same result:
x = linspace(0, 10, 50);
x = (x - mean(x))/std(x);
y = 2.*x.^3 + 5.*x.^2 + 1.*x + 3;
p1 = polyfit(x, y, 3)
p1 = 1×4
2.0000 5.0000 1.0000 3.0000
[p2,s2,mu2] = polyfit(x,y,3)
p2 = 1×4
2.0000 5.0000 1.0000 3.0000
s2 = struct with fields:
    R: [4×4 double]
    df: 46
    normr: 1.3608e-14
mu2 = 2×1
    0
    1.0000
  2 Comments
peter huang on 19 Mar 2023
If I use polyfit on real computed results in the future, should I scale the data first to obtain the fit's equation parameters, and then transform those coefficients back to my original x values to recover the equation?
the cyclist on 19 Mar 2023
@John D'Errico gave a thorough explanation of when scaling matters. You really should try to understand it. A shorthand heuristic for when scaling is important: check whether the highest polynomial powers of x are of a very different order of magnitude from the lower ones (e.g. x itself).
I don't think there is ever meaningful harm in doing the scaling. It just makes your code a bit more complicated, which is not a good thing if it wasn't necessary. These are the sorts of tradeoffs one makes in practical applications.
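To answer the practical question directly: you usually do not need to transform coefficients back by hand. If you keep the S and mu outputs, polyval applies the same centering and scaling for you. A minimal sketch (my own example values, not from this thread):
% Fit with centering/scaling, then evaluate at new points given in the
% ORIGINAL x units -- polyval rescales internally when passed S and mu.
x = linspace(0, 10, 50);
y = 2*x.^3 + 5*x.^2 + x + 3;
[p, S, mu] = polyfit(x, y, 3);   % p is in the scaled variable (x - mu(1))/mu(2)
xnew = 0:0.5:10;                 % new points, in the original units
ynew = polyval(p, xnew, S, mu);  % no manual back-transformation needed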



John D'Errico on 17 Mar 2023
Edited: John D'Errico on 17 Mar 2023
It is NOT abnormal. It is perfectly normal, and, in fact, perfectly correct. It is just something you don't understand. The scaling is actually a valuable tool, and when it is needed, the difference it makes is huge. I'll start with your example problem.
x = linspace(0, 10, 50);
y = 2.*x.^3 + 5.*x.^2 + 1.*x + 3;
Now fit, using polyfit. There is no real need here for centering and scaling. The trick is to look at the x variable: if you cube a number as large as 10, will there be numerical problems in double precision arithmetic? NO. I'll show as many digits as format will allow.
format long g
[P0,S0] = polyfit(x,y,3)
P0 = 1×4
2 5 1 2.99999999999999
S0 = struct with fields:
    R: [4×4 double]
    df: 46
    normr: 6.84802985100727e-13
So the coefficients are pretty much perfect, with only a tiny error. This is due to the linear algebra. Now do the same thing, but using centering and scaling.
[Pscaled,Sscaled,mu] = polyfit(x,y,3)
Pscaled = 1×4
52.6599154117171 309.766763848397 597.970066767299 383
Sscaled = struct with fields:
    R: [4×4 double]
    df: 46
    normr: 2.06965564368663e-12
mu = 2×1
    5
    2.97497545655373
And they look strange. But so what? It fits the data. We can use syms to see how it works, and to confirm that we can reconstruct the original cubic. The residual norm is actually slightly worse in the scaled case, roughly 3x larger, but it is still tiny.
syms X
vpa(expand(dot(((X - mu(1))/mu(2)).^[3 2 1 0],Pscaled)))
ans =
[symbolic output: the original cubic, 2*X^3 + 5*X^2 + X + 3, plus vanishingly small spurious terms]
And syms has reconstructed the original polynomial, as it should. The trash in there comes from the Symbolic Math Toolbox using higher precision, which exposes the fact that the coefficients coming out of polyfit are really only accurate to around 16 digits anyway. You can't get more than that from double precision arithmetic. So this would have been more realistic:
vpa(expand(dot(((X - mu(1))/mu(2)).^[3 2 1 0],Pscaled)),16)
ans =
[symbolic output to 16 digits: essentially 2*X^3 + 5*X^2 + X + 3, with small errors visible in the last digits]
As you can see, reconstructing the original polynomial actually loses some digits there. The crap in the low order bits starts to propagate out.
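If you would rather not use syms, the same back-substitution can be done numerically. A sketch (my code, not part of the original answer, using Pscaled and mu from the fit above):
% Expand Pscaled((x - mu(1))/mu(2)) back into coefficients of x by
% repeated convolution with the degree-1 polynomial (x - mu(1))/mu(2).
lin = [1/mu(2), -mu(1)/mu(2)];   % z = (x - mu(1))/mu(2), as a polynomial in x
n = numel(Pscaled);
Porig = zeros(1, n);
for k = 1:n
    term = Pscaled(k);           % coefficient of z^(n-k)
    for j = 1:(n-k)
        term = conv(term, lin);  % multiply by z once per power
    end
    Porig = Porig + [zeros(1, n - numel(term)), term];
end
Porig                            % should be close to [2 5 1 3]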
It is when you do need centering and scaling that the difference becomes important.
x2 = linspace(0,1000)';
Here, x2 varies from 0 to 1000. Raising those numbers to powers as large as 7 will now become a problem. Some of the numbers are O(1), but others will be O(1e21). The linear algebra will just crap out here on a degree 7 polynomial. And this time, I'll add in a tiny amount of noise. Since y2 is on the order of 20, adding noise on the order of 1e-11 is pretty small.
y2 = (x2.^(7:-1:0))*[1e-21;1e-18;2e-15;3e-12;4e-9;5e-6;6e-3;7] + randn(size(x2))*1e-11;
plot(x2,y2)
A perfectly reasonable looking curve. But now polyfit will get upset at me.
[P7,S7] = polyfit(x2,y2,7)
Warning: Polynomial is badly conditioned. Add points with distinct X values, reduce the degree of the polynomial, or try centering and scaling as described in HELP POLYFIT.
P7 = 1×8
9.99999999397225e-22 1.00000000219041e-18 1.99999999618841e-15 3.00000000387077e-12 3.99999999781423e-09 5.00000000060431e-06 0.00599999999993011 7.00000000000354
S7 = struct with fields:
    R: [8×8 double]
    df: 92
    normr: 9.65941050730836e-11
I'm a little surprised the linear algebra did not just give up the ghost here. But it did succeed, despite some nasty complaints from the solver. If we allow polyfit to center and scale the variables, it gets happier.
[P7scaled,S7scaled,mu7] = polyfit(x2,y2,7)
P7scaled = 1×8
0.000185585518515412 0.00284984821250556 0.0221512486299421 0.11891570100952 0.495444690900551 1.68262485971035 4.81235449283815 12.0234374999988
S7scaled = struct with fields:
    R: [8×8 double]
    df: 92
    normr: 9.65942655702076e-11
mu7 = 2×1
    500
    293.045373493758
No complaints now. I could now recover the original coefficients using syms. (Actually, I don't need syms here, but it makes for a pretty result.)
vpa(expand(dot(((X - mu7(1))/mu7(2)).^[7:-1:0],P7scaled)),16)
ans =
[symbolic output to 16 digits: approximately 1e-21*X^7 + 1e-18*X^6 + 2e-15*X^5 + 3e-12*X^4 + 4e-9*X^3 + 5e-6*X^2 + 0.006*X + 7]
The polynomial is the same (to within floating point crap). But when you really need the centering and scaling, the difference can be important, because if the polynomial degree is high enough, you might find that all you get out otherwise is NaNs.
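To make the conditioning argument concrete, here is a quick check (mine, not part of the original answer) comparing the raw and the centered/scaled Vandermonde matrices for this problem:
% Condition numbers of the degree-7 Vandermonde matrix, raw vs. scaled.
A_raw = x2.^(7:-1:0);               % columns range from O(1) up to O(1e21)
z = (x2 - mean(x2))/std(x2);        % the same transform that mu encodes
A_scaled = z.^(7:-1:0);             % all columns are O(1)
cond(A_raw)                         % enormous -- this is why polyfit warned
cond(A_scaled)                      % modest by comparison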
Edit: (I just saw this additional question)
"Also, I would like to inquire whether there would be an impact on the calculation if I use x1 = 1: 1:length(y) instead of x as the parameter in polyfit."
OF COURSE IT MATTERS!!!!!!!!! How could it not?
If you change the x variable, then the polynomial coefficients will change! In the case you mention, the change of variable is linear in x, so it amounts to nothing more than a different centering and scaling. You would therefore expect a completely different set of raw polynomial coefficients, but (not surprisingly, since the transformation is linear) the centered and scaled coefficients will be mathematically the same as the ones you saw when you used linspace to create x. A quick demonstration follows.
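Here is that demonstration (my sketch, not part of the original answer):
% A linear change of the x variable is absorbed entirely by mu, so the
% centered and scaled coefficients do not change.
x = linspace(0, 10, 50);
y = 2*x.^3 + 5*x.^2 + x + 3;
x1 = 1:numel(y);                    % arbitrary integer "index" axis
[pA, SA, muA] = polyfit(x, y, 3);
[pB, SB, muB] = polyfit(x1, y, 3);
max(abs(pA - pB))                   % agrees to round-off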
  3 Comments
peter huang on 19 Mar 2023
Regarding the choice of x: I actually want to apply this to calculating sea level trends. So how should I choose my x? Assuming I have 1000 calculated sea level results (y), should I use 1:length(y) or my datenum values for x?
John D'Errico on 19 Mar 2023
I'm a little confused by your question. x is whatever you decide it will be, because it is you who will be using that model! But it hardly seems logical to make x just a set of arbitrary integers, 1:length(y). You have the times when that information was collected. You can use them, though if the times are in datenum form, you would do better to convert them into days from the start of your sampling. Or you could scale things more appropriately; perhaps the time should be in terms of fractions of a year. Again, you can choose the units to use here.
All of this is the same reason why an astrophysicist uses light years (or even kilo- or megaparsecs), instead of feet or millimeters, to measure distances between astronomical objects. All that matters is that you use the correct (i.e., the same) units when you later use that model.
But building a model with polyfit as a function only of x = 1:length(y) means the input has no units at all. How would you then sensibly use that model?
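For instance, a minimal sketch (t and y are hypothetical variables here, with t a datenum vector and y the corresponding sea levels):
% Convert datenum times to days since the first sample, then fit a
% linear trend; the slope then has meaningful units (e.g. mm per day).
t_days = t - t(1);          % t: hypothetical datenum vector
p = polyfit(t_days, y, 1);  % y: hypothetical sea level series
slope_per_day = p(1)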

