fminunc converging at a strange point
I tried to fit the sin(x) function with a 7th-degree polynomial, using MATLAB's built-in polyfit and fminunc respectively.
I compared fminunc (least-squares fitting) with polyfit, but the results do not coincide.
Here is my code:
% clear session, close plots, clear screen
clear all; close all; clc
% data for regression
xm = linspace(0,4*pi,10);
ym = sin(xm);
% initial parameter guess
p0 = rand(1,8);
% define objective function (sum of squared errors)
objective = @(p) sum((yp(p)-ym).^2);
disp(['Initial Objective: ' num2str(objective(p0))])
% optimize with fminunc
%[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[popt,fval,exitflag,output,grad,hessian]=fminunc(objective,p0,options);
%
plot(xm,ym,'ro')
hold on
plot(xm,yp(popt),'gs')
% compare with polyfit
p = polyfit(xm,ym,7);
y1 = polyval(p,xm);
plot(xm,y1,'o')
legend('measured','fminunc','polyfit')
ylabel('y')
xlabel('x')
yp(popt)
y1
function cc = yp(p)
% local function: the script workspace is not visible here, so xm is rebuilt
xm = linspace(0,4*pi,10);
cc=p(8);
for i=1:7
cc = cc+p(i).*xm.^i;
end
end
I checked the gradient of the objective, and it is far from satisfying the convergence condition.
I don't know why the solver stops at such a strange solution. Could someone explain, or help me adjust the parameters?
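One way to see which stopping condition actually fired is to inspect the outputs the script already captures (a small diagnostic sketch, not part of the original question):
exitflag            % 1 = gradient tolerance met, 2 = step tolerance met, 0 = iteration limit
output.message      % the solver's own explanation of why it stopped
norm(grad,inf)      % the first-order optimality measure actually reached at popt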
Accepted Answer
Bruno Luong on 21 Aug 2022
@zhou caiwei: "1. MATLAB has a variety of solvers; if one of them does not work, try others until the problem is solved.
2. MATLAB is complex; details make a difference."
I think you missed the most important lesson:
3. Make sure your parameter scales are comparable when doing numerical minimization (and when solving linear systems as well):
% clear session, close plots, clear screen
clear all; close all; clc
% data for regression
xm = linspace(0,4*pi,10);
ym = sin(xm);
% initial parameter guess
p0 = rand(1,8);
% define objective function (sum of squared errors, on normalized x)
[xmax,xmin] = bounds(xm);
xmedian = (xmax+xmin)/2;
xdiff = (xmax-xmin)/2;
xn = (xm-xmedian)/xdiff;
objective = @(p) sum((ypn(p,xn)-ym).^2);
disp(['Initial Objective: ' num2str(objective(p0))])
% optimize with fminunc
%[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[pnopt,fval,exitflag,output,grad,hessian]=fminunc(objective,p0,options);
%
plot(xm,ym,'g-')
hold on
plot(xm,ypn(pnopt,xn),'ro')
% compare with polyfit
p = polyfit(xm,ym,7);
y1 = polyval(p,xm);
plot(xm,y1,'+b')
legend('measured','fminunc','polyfit')
ylabel('y')
xlabel('x')
ypn(pnopt,xn)
y1
function cc = ypn(pn,xn)
% evaluate the polynomial at the normalized points xn; pn(8) is the constant term
cc=pn(8);
for i=1:7
cc = cc+pn(i).*xn.^(i);
end
end
Bruno Luong on 22 Aug 2022 (edited)
Note that a polynomial in xn is still a polynomial in xm, since xn is an affine transformation of xm.
The coefficients of the optimal polynomial in xm can be derived from the pnopt returned by fminunc.
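A minimal sketch of that back-conversion (my addition, reusing xmedian, xdiff, and pnopt from the answer above; pnopt is ordered as in ypn, with the constant term last):
% coefficients of the xn-polynomial in descending-power (polyval) order
pn_desc = [fliplr(pnopt(1:7)) pnopt(8)];
% xn expressed as a degree-1 polynomial in x
a = [1/xdiff, -xmedian/xdiff];
% Horner-style composition: substitute a(x) for xn one coefficient at a time
p_xm = pn_desc(1);
for k = 2:numel(pn_desc)
    p_xm = conv(p_xm, a);               % multiply the partial result by xn(x)
    p_xm(end) = p_xm(end) + pn_desc(k); % add the next coefficient
end
% p_xm now works directly on the original x values:
% polyval(p_xm, xm) matches ypn(pnopt, xn)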
More Answers (3)
John D'Errico on 20 Aug 2022 (edited)
This is a common mistake. x varies over a moderately large range, so the POWERS of x get very large, which in turn means the coefficients of those terms must span a huge range of magnitudes. That makes this a poorly posed problem. Both fminunc and polyfit will have serious numerical problems, despite this being, in theory, a simple problem. Remember that you are using only double precision arithmetic.
So what are you trying to do?
xm = linspace(0,4*pi,10);
ym = sin(xm);
Ok, a REALLY bad idea is to try to fit a model with 8 unknown coefficients, using ONLY 10 data points. That is just silly.
xm = linspace(0,4*pi,100);
ym = sin(xm);
So xm grows as large as 4*pi. The series itself SHOULD be:
syms X
Taylor_sine = taylor(sin(X),'Order',8)
As you can see, though, x gets as large as 4*pi ≈ 12.57. This is all still computable in double precision arithmetic, although you are getting close to the edge. We can tell by looking at the condition number of the associated Vandermonde matrix.
cond(xm'.^(0:7))
So the solution would completely fail in SINGLE precision arithmetic, but it will survive in double precision arithmetic. Luckily, your data has no noise in it at all.
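As a rough rule of thumb (my addition, not part of John's answer), a solve loses about log10(cond) significant digits, which you can weigh against the roughly 16 digits of double precision and 7 digits of single precision:
c = cond(xm'.^(0:7));
fprintf('cond = %.2e, so roughly %.0f digits lost\n', c, log10(c))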
So what is the real reason why things seem to be performing poorly?
First, see how polyfit does:
format long g
poly7 = polyfit(xm,ym,7)
fplot(@(x) polyval(poly7,x),'r',[0,4*pi])
hold on
fplot(@sin,'b',[0,4*pi])
legend('Polyfit approximation to sin(x)','sin(x)')
hold off
So the fit itself looks to be not obscenely terrible, yet I see these coefficients are nothing close to what I would expect for a true sine series. (I'm actually surprised the fit is as close as it came.) There are significant areas of lack of fit though. (Look at what happens near 0 and 4*pi, as well as how the fit misses the peaks and valleys.)
Next, remember that in order to get any true convergence for a SINE series going out as far as x = 12, you would need many more terms than just x^7. That means your polynomial must be of significantly higher order than 7th degree. But if you were to use a much higher order polynomial, then even double precision arithmetic would fail you. Just for kicks, let's look at what the true truncated Taylor series produces for that function over that domain.
fplot(Taylor_sine,'r',[0,4*pi])
hold on
fplot(@sin,'b',[0,4*pi])
legend('Taylor series approximation to sin(x)','sin(x)')
hold off
Do you see the problem? You cannot use such a low order polynomial to well approximate the sine function over that domain. It is not even close to convergent beyond roughly 2*pi, but you are going out as far as 4*pi. Had I limited the domain to even 2*pi, the approximation would still be seen to have numerical problems.
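To put a number on that (an illustrative addition): the first Taylor term a 7th-degree fit drops is x^9/9!, and at the right end of the domain that single term dwarfs |sin(x)| <= 1:
x = 4*pi;
x^9/factorial(9)    % about 2.2e4 at x = 4*pi, so the truncation error is enormous
The terms x^n/n! only begin to shrink once n exceeds x ≈ 12.6, so a much higher degree would be needed for convergence out to 4*pi.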
Fminunc will of course have serious numerical problems too, but I should be able to stop here, because fminunc is a completely different algorithm, and it will have its own completely different problems. You should expect numerical issues no matter what you do in this problem. If you do get a fit, it will look nothing like what you would expect the truncated sine series coefficients actually should be. If you use a higher order model, then the fit will be complete crap.
Added:
Fminunc. What happens? First, if you use only 10 data points, expect it to be complete crap. And you used a random start point for the initial value. SIGH. Garbage in, garbage out.
When you start with a REALLY bad initial value for the coefficients, the optimizer sees large residuals to start with, and that helps it get lost and diverge. For example, first with only the 10 points you had:
xm = linspace(0,4*pi,10);
ym = sin(xm);
format short
p0 = rand(1,8)
These are terrible start points. RANDOM START POINTS ARE OFTEN A BAD IDEA.
% far simpler than using a function that you wrote. LEARN TO USE EXISTING CODE!!!!
yp = @(P) polyval(P,xm);
Before we do a fit, look at what yp returns:
yp(p0)
Do you see the initial polynomial is complete CRAPOLA? Any random polynomial will be that. Remember what I said. You would expect this to be true. These are TERRIBLY poor starting values for a polynomial fit. So the solver can get lost.
% define objective function (scaled sum of squared errors)
objective = @(p) sum((yp(p)-ym).^2);
disp(['Initial Objective: ' num2str(objective(p0))])
Do you see that the initial objective function is HUGE at the start point?
You have too little data. You have terrible start points. Any optimizer will have problems.
% optimize with fminunc
%[X,FVAL,EXITFLAG,OUTPUT,LAMBDA,GRAD,HESSIAN]
options = optimoptions(@fminunc,'Display','iter','Algorithm','quasi-newton');
[popt,fval,exitflag,output,grad,hessian]=fminunc(objective,p0,options);
format short g
popt
fval
So even at the final point, the objective is relatively huge. Remember that fminunc is NOT a least squares solver, but a general optimizer. So that too is an issue.
Perhaps a better choice of starting values would simply be:
Myp0 = 1./factorial(7:-1:0).*repmat([1 .01],1,4)
yp(Myp0)
These numbers are at least a little smaller, so the solver may have an easier time of it. But what is the default tolerance? TolFun for fminunc is 1e-6. So once the objective function has been reduced from 1e15 down to around 1e6 and the gradient is relatively small, fminunc thinks it is done.
options.TolFun = 1e-13; % since we are so far away initially, we will need a massive reduction.
[popt,fval,exitflag,output,grad,hessian]=fminunc(objective,Myp0,options);
popt
fval
Still crap. But this may just be getting at the idea that fminunc is not designed to solve that class of problem, at least not without problems of its own.
fplot(@(x) polyval(popt,x),'r',[0,4*pi])
hold on
fplot(@sin,'b',[0,4*pi])
plot(xm,ym,'go')
legend('Fminunc approximation to sin(x)','sin(x)','data')
hold off
It fit the first couple of points OK, but then things got out of hand. Poor starting values are a problem, and mine were not much better. It looks like the solver effectively starts chasing the last data point, because that is the one with the largest residual at the starting values. But then it gets lost.
Torsten on 20 Aug 2022 (edited)
xm = linspace(0,4*pi,10).';
ym = sin(xm);
% initial parameter guess
p0 = rand(1,8);
% polynomial model, coefficients ordered from x^0 up to x^7
yp = @(p) sum(p.*xm.^(0:7),2);
% lsqnonlin expects the residual vector, not the sum of squares
objective = @(p) yp(p)-ym;
options = optimset('TolFun',1e-8,'TolX',1e-8);
[p1,res] = lsqnonlin(objective,p0,[],[],options)
y1 = yp(p1);
% compare with polyfit (fliplr to match the ascending-power ordering above)
p2 = polyfit(xm,ym,7);
p2flip = fliplr(p2)
y2 = polyval(p2,xm);
plot(xm,ym)
hold on
plot(xm,y1)
plot(xm,y2)
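The reason this behaves better: lsqnonlin exploits the sum-of-squares structure of the residual (Gauss-Newton / Levenberg-Marquardt style steps), so a dedicated least-squares solver tends to do much better on this problem than a general-purpose minimizer like fminunc.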