fminunc converging at a strange point
I tried to fit the sin(x) function with a 7th-degree polynomial, using MATLAB's built-in polyfit and fminunc respectively.
I compared fminunc (least-squares fitting) with polyfit, but the results do not coincide.
Here is my code:
% clear session, close plots, clear screen
clear all; close all; clc
% data for regression
xm = linspace(0,4*pi,10);
ym = sin(xm);
% initial parameter guess
p0 = rand(1,8);
% define objective function (sum of squared errors)
objective = @(p) sum((yp(p)-ym).^2);
disp(['Initial Objective: ' num2str(objective(p0))])
% optimize with fminunc
%[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[popt,fval,exitflag,output,grad,hessian]=fminunc(objective,p0,options);
%
plot(xm,ym,'ro')
hold on
plot(xm,yp(popt),'gs')
% compare with polyfit
p = polyfit(xm,ym,7);
y1 = polyval(p,xm);
plot(xm,y1,'o')
legend('measured','fminunc','polyfit')
ylabel('y')
xlabel('x')
yp(popt)
y1
function cc = yp(p)
% local function: the script workspace is not visible here, so xm is rebuilt
xm = linspace(0,4*pi,10);
cc=p(8);
for i=1:7
cc = cc+p(i).*xm.^i;
end
end
I checked the gradient of the objective, and it is far from satisfying the convergence condition.
I don't know why the solver stops at such a strange solution. Could someone explain, or help me adjust the parameters?
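One way to see which stopping condition actually fired is to inspect the outputs the script already captures (a small diagnostic sketch, not part of the original question):
exitflag            % 1 = gradient tolerance met, 2 = step tolerance met, 0 = iteration limit
output.message      % the solver's own explanation of why it stopped
norm(grad,inf)      % the first-order optimality measure actually reached at popt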
Accepted Answer
Bruno Luong on 21 Aug 2022
@zhou caiwei: "1. MATLAB has a variety of solvers; if one of them does not work, try others until the problem is solved.
2. MATLAB is complex; details make a difference."
I think you missed the most important lesson:
3. Make sure your parameter scales are comparable when doing numerical minimization (and when solving linear systems as well):
% clear session, close plots, clear screen
clear all; close all; clc
% data for regression
xm = linspace(0,4*pi,10);
ym = sin(xm);
% initial parameter guess
p0 = rand(1,8);
% define objective function (sum of squared errors, on normalized x)
[xmax,xmin] = bounds(xm);
xmedian = (xmax+xmin)/2;
xdiff = (xmax-xmin)/2;
xn = (xm-xmedian)/xdiff;
objective = @(p) sum((ypn(p,xn)-ym).^2);
disp(['Initial Objective: ' num2str(objective(p0))])
% optimize with fminunc
%[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[pnopt,fval,exitflag,output,grad,hessian]=fminunc(objective,p0,options);
%
plot(xm,ym,'g-')
hold on
plot(xm,ypn(pnopt,xn),'ro')
% compare with polyfit
p = polyfit(xm,ym,7);
y1 = polyval(p,xm);
plot(xm,y1,'+b')
legend('measured','fminunc','polyfit')
ylabel('y')
xlabel('x')
ypn(pnopt,xn)
y1
function cc = ypn(pn,xn)
% evaluate the polynomial at the normalized points xn; pn(8) is the constant term
cc=pn(8);
for i=1:7
cc = cc+pn(i).*xn.^(i);
end
end
Bruno Luong on 22 Aug 2022 (edited)
Note that a polynomial in xn is still a polynomial in xm, since xn is an affine transformation of xm.
The coefficients of the optimal polynomial in xm can be derived from the pnopt returned by fminunc.
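A minimal sketch of that back-conversion (my addition, reusing xmedian, xdiff, and pnopt from the answer above; pnopt is ordered as in ypn, with the constant term last):
% coefficients of the xn-polynomial in descending-power (polyval) order
pn_desc = [fliplr(pnopt(1:7)) pnopt(8)];
% xn expressed as a degree-1 polynomial in x
a = [1/xdiff, -xmedian/xdiff];
% Horner-style composition: substitute a(x) for xn one coefficient at a time
p_xm = pn_desc(1);
for k = 2:numel(pn_desc)
    p_xm = conv(p_xm, a);               % multiply the partial result by xn(x)
    p_xm(end) = p_xm(end) + pn_desc(k); % add the next coefficient
end
% p_xm now works directly on the original x values:
% polyval(p_xm, xm) matches ypn(pnopt, xn)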
More Answers (3)
John D'Errico on 20 Aug 2022 (edited)
This is a common mistake. x varies over a moderately large range, so the POWERS of x get very large, which in turn means the coefficients of those terms must span a huge range of magnitudes. That makes this a poorly posed problem. Both fminunc and polyfit will have serious numerical problems, despite this being, in theory, a simple problem. Remember that you are using only double precision arithmetic.
So what are you trying to do?
xm = linspace(0,4*pi,10);
ym = sin(xm);
Ok, a REALLY bad idea is to try to fit a model with 8 unknown coefficients, using ONLY 10 data points. That is just silly.
xm = linspace(0,4*pi,100);
ym = sin(xm);
So xm grows as large as 4*pi. The series itself SHOULD be:
syms X
Taylor_sine = taylor(sin(X),'Order',8)
As you can see, though, x gets as large as 4*pi ≈ 12.57. This is all still computable in double precision arithmetic, although you are getting close to the edge. We can tell by looking at the condition number of the associated Vandermonde matrix.
cond(xm'.^(0:7))
So the solution would completely fail in SINGLE precision arithmetic, but it will survive in double precision arithmetic. Luckily, your data has no noise in it at all.
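As a rough rule of thumb (my addition, not part of John's answer), a solve loses about log10(cond) significant digits, which you can weigh against the roughly 16 digits of double precision and 7 digits of single precision:
c = cond(xm'.^(0:7));
fprintf('cond = %.2e, so roughly %.0f digits lost\n', c, log10(c))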
So what is the real reason why things seem to be performing poorly?
First, see how polyfit does:
format long g
poly7 = polyfit(xm,ym,7)
fplot(@(x) polyval(poly7,x),'r',[0,4*pi])
hold on
fplot(@sin,'b',[0,4*pi])
legend('Polyfit approximation to sin(x)','sin(x)')
hold off
So the fit itself looks to be not obscenely terrible, yet I see these coefficients are nothing close to what I would expect for a true sine series. (I'm actually surprised the fit is as close as it came.) There are significant areas of lack of fit though. (Look at what happens near 0 and 4*pi, as well as how the fit misses the peaks and valleys.)
Next, remember that in order to get any true convergence for a SINE series going out as far as x = 12, you would need many more terms than just x^7. That means your polynomial must be of significantly higher order than 7th degree. But if you were to use a much higher order polynomial, then even double precision arithmetic would fail you. Just for kicks, let's look at what the true truncated Taylor series produces for that function over that domain.
fplot(Taylor_sine,'r',[0,4*pi])
hold on
fplot(@sin,'b',[0,4*pi])
legend('Taylor series approximation to sin(x)','sin(x)')
hold off
Do you see the problem? You cannot use such a low order polynomial to well approximate the sine function over that domain. It is not even close to convergent beyond roughly 2*pi, but you are going out as far as 4*pi. Had I limited the domain to even 2*pi, the approximation would still be seen to have numerical problems.
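To put a number on that (an illustrative addition): the first Taylor term a 7th-degree fit drops is x^9/9!, and at the right end of the domain that single term dwarfs |sin(x)| <= 1:
x = 4*pi;
x^9/factorial(9)    % about 2.2e4 at x = 4*pi, so the truncation error is enormous
The terms x^n/n! only begin to shrink once n exceeds x ≈ 12.6, so a much higher degree would be needed for convergence out to 4*pi.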
Fminunc will of course have serious numerical problems too, but I should be able to stop here, because fminunc is a completely different algorithm, and it will have its own completely different problems. You should expect numerical issues no matter what you do in this problem. If you do get a fit, it will look nothing like what you would expect the truncated sine series coefficients actually should be. If you use a higher order model, then the fit will be complete crap.
Added:
Fminunc. What happens? First, if you use only 10 data points, expect it to be complete crap. And you used a random start point for the initial value. SIGH. Garbage in, garbage out.
When you start with a REALLY bad initial value for the coefficients, the optimizer sees large residuals to start with, and that helps it get lost and diverge. For example, first with only the 10 points you had:
xm = linspace(0,4*pi,10);
ym = sin(xm);
format short
p0 = rand(1,8)
These are terrible start points. RANDOM START POINTS ARE OFTEN A BAD IDEA.
% far simpler than using a function that you wrote. LEARN TO USE EXISTING CODE!!!!
yp = @(P) polyval(P,xm);
Before we do a fit, look at what yp returns:
yp(p0)
Do you see the initial polynomial is complete CRAPOLA? Any random polynomial will be that. Remember what I said. You would expect this to be true. These are TERRIBLY poor starting values for a polynomial fit. So the solver can get lost.
% define objective function (scaled sum of squared errors)
objective = @(p) sum((yp(p)-ym).^2);
disp(['Initial Objective: ' num2str(objective(p0))])
Do you see that the initial objective function is HUGE at the start point?
You have too little data. You have terrible start points. Any optimizer will have problems.
% optimize with fminunc
%[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[popt,fval,exitflag,output,grad,hessian]=fminunc(objective,p0,options);
format short g
popt
fval
So even at the final point, the objective is relatively huge. Remember that fminunc is NOT a least squares solver, but a general optimizer. So that too is an issue.
Perhaps a better choice of starting values would simply be:
Myp0 = 1./factorial(7:-1:0).*repmat([1 .01],1,4)
yp(Myp0)
These numbers are at least a little smaller, so the solver may have an easier time of it. But what is the default tolerance? TolFun for fminunc is 1e-6. So once the objective function has been reduced from 1e15 down to around 1e6 and the gradient is relatively small, fminunc thinks it is done.
options.TolFun = 1e-13; % since we are so far away initially, we will need a massive reduction.
[popt,fval,exitflag,output,grad,hessian]=fminunc(objective,Myp0,options);
popt
fval
Still crap. But this may just be getting at the idea that fminunc is not designed to solve that class of problem, at least not without problems of its own.
fplot(@(x) polyval(popt,x),'r',[0,4*pi])
hold on
fplot(@sin,'b',[0,4*pi])
plot(xm,ym,'go')
legend('Fminunc approximation to sin(x)','sin(x)','data')
hold off
It fit the first couple of points OK, but then things got out of hand. Poor starting values are a problem, and mine were not much better. It looks like the solver effectively starts chasing the last data point, because that is the one with the largest residual at the starting values. But then it gets lost.
Torsten on 20 Aug 2022 (edited)
xm = linspace(0,4*pi,10).';
ym = sin(xm);
% initial parameter guess
p0 = rand(1,8);
% polynomial model, coefficients ordered from x^0 up to x^7
yp = @(p) sum(p.*xm.^(0:7),2);
% lsqnonlin expects the residual vector, not the sum of squares
objective = @(p) yp(p)-ym;
options = optimset('TolFun',1e-8,'TolX',1e-8);
[p1,res] = lsqnonlin(objective,p0,[],[],options)
y1 = yp(p1);
% compare with polyfit (fliplr to match the ascending-power ordering above)
p2 = polyfit(xm,ym,7);
p2flip = fliplr(p2)
y2 = polyval(p2,xm);
plot(xm,ym)
hold on
plot(xm,y1)
plot(xm,y2)
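The reason this behaves better: lsqnonlin exploits the sum-of-squares structure of the residual (Gauss-Newton / Levenberg-Marquardt style steps), so a dedicated least-squares solver tends to do much better on this problem than a general-purpose minimizer like fminunc.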