Can I disable intermediate calculations for fmincon
In the fmincon optimization there are always many 'intermediate calculations' (see https://www.mathworks.com/help/optim/ug/iterations-and-function-counts.html#mw_dc044841-a6b6-43c0-8b29-0af2fbbcb66c) that increase the function count during the optimization. The link says that the "intermediate calculations can involve evaluating the objective function and any constraints at points near the current iterate x_i. For example, the solver might estimate a gradient by finite differences."
What is the purpose of these intermediate calculations? Since I have provided the gradient calculation for my objective function, why would the optimizer need to calculate a finite-difference gradient? My example objective function does not have any constraints.
My example code and the optimization output are shown below. The 'Iter' and 'F-count' columns of the output show that many intermediate calculations are involved.
If the objective and gradient calculations are expensive, these intermediate calculations can take a lot of time.
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display',...
'iter');
fun = @rosenboth;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
[x,f] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 1.040000e+02 0.000e+00 3.960e+02
1 7 1.028667e+02 0.000e+00 6.444e+02 7.071e-01
2 9 8.035769e+01 0.000e+00 4.797e+02 7.071e-01
3 10 6.132722e+00 0.000e+00 7.816e+00 1.155e+00
4 11 6.065238e+00 0.000e+00 5.189e+00 3.681e-02
5 12 5.678075e+00 0.000e+00 6.330e+00 2.437e-01
6 14 5.112684e+00 0.000e+00 3.119e+01 5.825e-01
7 15 4.769085e+00 0.000e+00 2.229e+01 8.987e-02
8 16 4.630101e+00 0.000e+00 4.064e+01 6.700e-01
9 17 3.708221e+00 0.000e+00 7.080e+00 1.786e-01
10 18 3.175089e+00 0.000e+00 7.950e+00 2.845e-01
11 20 3.165815e+00 0.000e+00 2.115e+01 2.951e-01
12 21 2.899436e+00 0.000e+00 9.888e+00 1.282e-01
13 22 2.725372e+00 0.000e+00 7.164e+00 6.340e-02
14 23 2.382814e+00 0.000e+00 1.316e+01 3.822e-01
15 24 2.129017e+00 0.000e+00 4.236e+00 1.134e-01
16 25 1.874512e+00 0.000e+00 4.274e+00 1.216e-01
17 27 1.784218e+00 0.000e+00 1.310e+01 2.576e-01
18 28 1.522263e+00 0.000e+00 3.092e+00 1.092e-01
19 29 1.353081e+00 0.000e+00 2.848e+00 7.749e-02
20 31 1.178302e+00 0.000e+00 8.209e+00 1.660e-01
21 32 1.014260e+00 0.000e+00 3.229e+00 2.716e-02
22 33 7.723798e-01 0.000e+00 5.463e+00 1.596e-01
23 34 6.002908e-01 0.000e+00 3.264e+00 8.887e-02
24 35 4.638434e-01 0.000e+00 4.710e+00 1.346e-01
25 36 2.907823e-01 0.000e+00 5.478e+00 2.276e-01
26 39 1.881240e-01 0.000e+00 6.312e+00 1.949e-01
27 40 1.729287e-01 0.000e+00 1.782e+00 9.193e-02
28 41 1.396410e-01 0.000e+00 8.969e-01 6.110e-02
29 43 1.223560e-01 0.000e+00 2.758e+00 6.740e-02
30 44 1.073474e-01 0.000e+00 2.984e+00 4.174e-02
First-order Norm of
Iter F-count f(x) Feasibility optimality step
31 45 5.633254e-02 0.000e+00 1.855e+00 1.399e-01
32 46 3.253577e-02 0.000e+00 1.257e+00 9.971e-02
33 47 1.470709e-02 0.000e+00 1.393e+00 1.220e-01
34 48 1.418260e-02 0.000e+00 3.657e+00 9.564e-02
35 50 2.088770e-04 0.000e+00 4.220e-01 1.271e-01
36 51 1.699139e-04 0.000e+00 4.750e-02 7.371e-03
37 52 6.403872e-05 0.000e+00 7.905e-02 1.168e-02
38 53 7.152289e-06 0.000e+00 9.437e-02 1.448e-02
39 54 3.937940e-07 0.000e+00 2.330e-02 2.254e-03
40 55 1.379737e-10 0.000e+00 6.095e-05 4.873e-04
41 56 3.901588e-14 0.000e+00 1.103e-06 2.559e-05
42 57 1.179907e-20 0.000e+00 4.271e-09 4.383e-07
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
function [f, g] = rosenboth(x)
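% Rosenbrock function and its analytic gradient.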
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
if nargout > 1 % gradient required
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
end
Accepted Answer
Matt J
on 24 Jun 2022
Edited: Matt J
on 24 Jun 2022
Since you are specifying the objective gradient, finite-difference calculations will not be executed for that particular piece of the iteration loop. However, if second derivatives are needed and you haven't provided a Hessian calculation, finite differences will still be needed for that. Also, multiple function evaluations may still be necessary, depending on the algorithm, for things like line searches.
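For example, a Hessian can be passed through the HessianFcn option so the solver does not have to approximate the second derivatives itself. A minimal sketch (the rosenhess name is illustrative; essentially the same function appears later in this thread as hessinterior):
% Sketch: supply an analytic Hessian to the interior-point algorithm.
options = optimoptions('fmincon', ...
    'Algorithm','interior-point', ...
    'SpecifyObjectiveGradient',true, ...
    'HessianFcn',@rosenhess, ...
    'Display','iter');
x = fmincon(@rosenboth,[-1,2],[],[],[],[],[],[],[],options);
function h = rosenhess(x,lambda)
% Hessian of the Lagrangian; lambda (constraint multipliers) is unused
% because this problem has no nonlinear constraints.
h = [1200*x(1)^2 - 400*x(2) + 2, -400*x(1);
     -400*x(1),                   200];
end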
15 Comments
Fangning Zheng
on 24 Jun 2022
Can the interior-point algorithm work without the Hessian? Or is there any other algorithm that does not require a Hessian calculation? Thank you!
Torsten
on 24 Jun 2022
All fmincon algorithms use the Hessian.
If you want an algorithm without a Hessian, you could program the "steepest descent method" on your own.
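For illustration, a minimal sketch of such a hand-rolled steepest descent with a backtracking (Armijo) line search; the step and tolerance values are illustrative, not tuned:
% Minimal steepest-descent sketch with backtracking line search.
fun = @rosenboth;     % returns [f, g]
x   = [-1; 2];        % start point (column vector)
for k = 1:10000
    [f, g] = fun(x);
    if norm(g) < 1e-8, break; end          % stop when the gradient is small
    p = -g;                                % steepest-descent direction
    t = 1;                                 % trial step length
    while fun(x + t*p) > f + 1e-4*t*(g'*p) && t > 1e-12
        t = t/2;                           % backtrack until sufficient decrease
    end
    x = x + t*p;
end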
Matt J
on 24 Jun 2022
You're quite welcome, but please Accept-click the answer if you consider the matter closed.
Matt J
on 24 Jun 2022
I don't think the extra evaluations in your case are coming from finite-difference approximations to the Hessian. The default interior-point algorithm settings use the gradient only. The extra evaluations are likely coming from line searches.
Also, you do not need to implement the steepest descent algorithm on your own. This example shows how to do it with fminunc.
However, steepest descent, or any algorithm that does not use at least an approximation to the Hessian, tends to perform badly, so it is not recommended.
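For reference, a rough sketch of that fminunc-based approach; the HessUpdate option and its 'steepdesc' value are recalled from the fminunc documentation and should be checked against the current release:
% Sketch: steepest descent via fminunc's quasi-newton algorithm
% (assumes HessUpdate accepts 'steepdesc', as documented).
options = optimoptions('fminunc', ...
    'Algorithm','quasi-newton', ...
    'SpecifyObjectiveGradient',true, ...
    'HessUpdate','steepdesc', ...
    'Display','iter');
x = fminunc(@rosenboth,[-1,2],options);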
Fangning Zheng
on 24 Jun 2022
If the extra evaluations come from line searches, then the gradient should not be required for them. My understanding of a line search is that once the gradient is calculated, the algorithm uses it to compute the descent direction p_k, and then searches for a step size alpha_k such that f(x_k + alpha_k * p_k) < f(x_k). In that case the gradient is not needed at the trial points. If we provide the gradient calculation in the same function as the objective, isn't that gradient calculation wasted?
Torsten
on 24 Jun 2022
The gradient is requested in every call. This seems to indicate that the objective function is not being called for a line search.
Why do you specify "FiniteDifferenceStepSize" = 100? It should be on the order of 1e-8.
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display',...
'iter','FiniteDifferenceStepSize',100);
fun = @rosenboth;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
[x,f] = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
ans = 2
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 1.040000e+02 0.000e+00 3.960e+02
ans = 2
ans = 2
ans = 2
ans = 2
ans = 2
ans = 2
1 7 1.028667e+02 0.000e+00 6.444e+02 7.071e-01
ans = 2
ans = 2
2 9 8.035769e+01 0.000e+00 4.797e+02 7.071e-01
ans = 2
3 10 6.132722e+00 0.000e+00 7.816e+00 1.155e+00
ans = 2
4 11 6.065238e+00 0.000e+00 5.189e+00 3.681e-02
ans = 2
5 12 5.678075e+00 0.000e+00 6.330e+00 2.437e-01
ans = 2
ans = 2
6 14 5.112684e+00 0.000e+00 3.119e+01 5.825e-01
ans = 2
7 15 4.769085e+00 0.000e+00 2.229e+01 8.987e-02
ans = 2
8 16 4.630101e+00 0.000e+00 4.064e+01 6.700e-01
ans = 2
9 17 3.708221e+00 0.000e+00 7.080e+00 1.786e-01
ans = 2
10 18 3.175089e+00 0.000e+00 7.950e+00 2.845e-01
ans = 2
ans = 2
11 20 3.165815e+00 0.000e+00 2.115e+01 2.951e-01
ans = 2
12 21 2.899436e+00 0.000e+00 9.888e+00 1.282e-01
ans = 2
13 22 2.725372e+00 0.000e+00 7.164e+00 6.340e-02
ans = 2
14 23 2.382814e+00 0.000e+00 1.316e+01 3.822e-01
ans = 2
15 24 2.129017e+00 0.000e+00 4.236e+00 1.134e-01
ans = 2
16 25 1.874512e+00 0.000e+00 4.274e+00 1.216e-01
ans = 2
ans = 2
17 27 1.784218e+00 0.000e+00 1.310e+01 2.576e-01
ans = 2
18 28 1.522263e+00 0.000e+00 3.092e+00 1.092e-01
ans = 2
19 29 1.353081e+00 0.000e+00 2.848e+00 7.749e-02
ans = 2
ans = 2
20 31 1.178302e+00 0.000e+00 8.209e+00 1.660e-01
ans = 2
21 32 1.014260e+00 0.000e+00 3.229e+00 2.716e-02
ans = 2
22 33 7.723798e-01 0.000e+00 5.463e+00 1.596e-01
ans = 2
23 34 6.002908e-01 0.000e+00 3.264e+00 8.887e-02
ans = 2
24 35 4.638434e-01 0.000e+00 4.710e+00 1.346e-01
ans = 2
25 36 2.907823e-01 0.000e+00 5.478e+00 2.276e-01
ans = 2
ans = 2
ans = 2
26 39 1.881240e-01 0.000e+00 6.312e+00 1.949e-01
ans = 2
27 40 1.729287e-01 0.000e+00 1.782e+00 9.193e-02
ans = 2
28 41 1.396410e-01 0.000e+00 8.969e-01 6.110e-02
ans = 2
ans = 2
29 43 1.223560e-01 0.000e+00 2.758e+00 6.740e-02
ans = 2
30 44 1.073474e-01 0.000e+00 2.984e+00 4.174e-02
ans = 2
First-order Norm of
Iter F-count f(x) Feasibility optimality step
31 45 5.633254e-02 0.000e+00 1.855e+00 1.399e-01
ans = 2
32 46 3.253577e-02 0.000e+00 1.257e+00 9.971e-02
ans = 2
33 47 1.470709e-02 0.000e+00 1.393e+00 1.220e-01
ans = 2
34 48 1.418260e-02 0.000e+00 3.657e+00 9.564e-02
ans = 2
ans = 2
35 50 2.088770e-04 0.000e+00 4.220e-01 1.271e-01
ans = 2
36 51 1.699139e-04 0.000e+00 4.750e-02 7.371e-03
ans = 2
37 52 6.403872e-05 0.000e+00 7.905e-02 1.168e-02
ans = 2
38 53 7.152289e-06 0.000e+00 9.437e-02 1.448e-02
ans = 2
39 54 3.937940e-07 0.000e+00 2.330e-02 2.254e-03
ans = 2
40 55 1.379737e-10 0.000e+00 6.095e-05 4.873e-04
ans = 2
41 56 3.901588e-14 0.000e+00 1.103e-06 2.559e-05
ans = 2
42 57 1.179907e-20 0.000e+00 4.271e-09 4.383e-07
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
function [f, g] = rosenboth(x)
nargout
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
if nargout > 1 % gradient required
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
end
Fangning Zheng
on 24 Jun 2022
I was trying to see if there are any differences with different values of "FiniteDifferenceStepSize"; you can disable this parameter....
I agree with you that every call evaluates both the objective function and the gradient. I'm still pretty confused about the intermediate calculations: what exactly are they calculating?
Fangning Zheng
on 24 Jun 2022
I printed out the fval and added the Hessian calculation. The total number of iterations is smaller, but at some iterations the algorithm is still making intermediate calculations that call the gradient. I think it is doing a line search, but why is it still calling the gradient calculation? That seems like a waste. Is it because it is doing an exact line search?
options = optimoptions('fmincon','SpecifyObjectiveGradient',true,'Display',...
'iter','HessianFcn',@hessinterior);
fun = @rosenboth;
x0 = [-1,2];
A = [];
b = [];
Aeq = [];
beq = [];
lb = [];
ub = [];
nonlcon = [];
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options);
f = 104
nargout = 2
First-order Norm of
Iter F-count f(x) Feasibility optimality step
0 1 1.040000e+02 0.000e+00 3.960e+02
f = 7.381693e+02
nargout = 2
f = 1.097243e+02
nargout = 2
f = 6.568461e+00
nargout = 2
1 4 6.568461e+00 0.000e+00 5.317e+01 3.536e-01
f = 4.452518e+00
nargout = 2
2 5 4.452518e+00 0.000e+00 1.574e+01 7.071e-01
f = 4.330045e+00
nargout = 2
3 6 4.330045e+00 0.000e+00 3.727e+01 7.771e-01
f = 2.840799e+00
nargout = 2
4 7 2.840799e+00 0.000e+00 4.947e+00 7.604e-02
f = 3.832249e+01
nargout = 2
f = 4.105706e+00
nargout = 2
f = 2.398100e+00
nargout = 2
5 10 2.398100e+00 0.000e+00 1.131e+01 3.305e-01
f = 1.835225e+00
nargout = 2
6 11 1.835225e+00 0.000e+00 5.917e+00 1.914e-01
f = 1.485471e+00
nargout = 2
7 12 1.485471e+00 0.000e+00 1.023e+01 2.588e-01
f = 1.025018e+00
nargout = 2
8 13 1.025018e+00 0.000e+00 2.046e+00 1.030e-01
f = 1.820866e+00
nargout = 2
f = 8.166366e-01
nargout = 2
9 15 8.166366e-01 0.000e+00 6.841e+00 1.713e-01
f = 5.455115e-01
nargout = 2
10 16 5.455115e-01 0.000e+00 2.276e+00 1.271e-01
f = 5.033114e-01
nargout = 2
11 17 5.033114e-01 0.000e+00 9.925e+00 2.588e-01
f = 2.126112e-01
nargout = 2
12 18 2.126112e-01 0.000e+00 4.565e-01 1.061e-01
f = 1.093302e+00
nargout = 2
f = 1.626042e-01
nargout = 2
13 20 1.626042e-01 0.000e+00 6.961e+00 2.376e-01
f = 6.438567e-02
nargout = 2
14 21 6.438567e-02 0.000e+00 4.387e-01 1.038e-01
f = 1.012490e-01
nargout = 2
f = 3.497577e-02
nargout = 2
15 23 3.497577e-02 0.000e+00 2.614e+00 1.589e-01
f = 1.234638e-02
nargout = 2
16 24 1.234638e-02 0.000e+00 1.065e+00 1.239e-01
f = 3.343805e-03
nargout = 2
17 25 3.343805e-03 0.000e+00 1.357e+00 1.291e-01
f = 3.938513e-04
nargout = 2
18 26 3.938513e-04 0.000e+00 2.067e-01 5.722e-02
f = 1.223894e-05
nargout = 2
19 27 1.223894e-05 0.000e+00 1.079e-01 3.746e-02
f = 1.383574e-08
nargout = 2
20 28 1.383574e-08 0.000e+00 1.339e-03 4.663e-03
f = 2.260423e-14
nargout = 2
21 29 2.260423e-14 0.000e+00 4.744e-06 2.514e-04
f = 5.099602e-26
nargout = 2
22 30 5.099602e-26 0.000e+00 2.594e-12 2.046e-07
Local minimum found that satisfies the constraints.
Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.
function [f, g] = rosenboth(x)
f = 100*(x(2) - x(1)^2)^2 + (1-x(1))^2;
fprintf('f = %d \n',f)
if nargout > 1 % gradient required
fprintf('nargout = %d \n',nargout)
g = [-400*(x(2)-x(1)^2)*x(1)-2*(1-x(1));
200*(x(2)-x(1)^2)];
end
end
function h = hessinterior(x,lambda)
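% Hessian of the Lagrangian; lambda (the constraint multipliers) is unused
% because there are no nonlinear constraints in this problem.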
h = [1200*x(1)^2-400*x(2)+2, -400*x(1);
-400*x(1), 200];
end
Torsten
on 24 Jun 2022
The extra calls are only in the first iteration. Maybe it is a check of whether the Hessian you supplied is correct; I don't know.
But obviously, the intermediate calls in your previous code were due to the calculation of the Hessian, because they are absent in the code above.
But why do you care so much about the internals of fmincon? You won't be able to answer such questions with absolute certainty. Just try the different options and see which are the fastest and/or most reliable for your problem.
Fangning Zheng
on 24 Jun 2022
The extra calls happened twice at iterations 1 and 7, and once at iterations 13, 18 and 21.
My own optimization problem requires a function call (a forward simulation) whose run time is about 1.5 hours. I use finite differences (my own script) to calculate the gradient, and there are 45 control variables in total, so every gradient calculation requires 45 forward simulations. I run them in parallel on 30 nodes, so the average run time for calculating both the objective function and the gradient is about 3 hours per iteration (and this would become much more expensive if a Hessian were added via finite differences, so I do not use a Hessian). I cannot really afford the intermediate calculations unless they are needed...
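(For illustration, a minimal sketch of the kind of parallel finite-difference gradient described above; forwardSim and the step size h are hypothetical placeholders for the actual simulation and step.)
% Sketch: objective with a forward-difference gradient computed in parallel.
function [f, g] = objectiveWithFDGradient(u)
h = 1e-6;                         % FD step size (illustrative)
f = forwardSim(u);                % one forward simulation for the objective
n = numel(u);                     % e.g. 45 control variables
g = zeros(n,1);
parfor i = 1:n                    % perturbed simulations run in parallel
    up    = u;
    up(i) = up(i) + h;
    g(i)  = (forwardSim(up) - f) / h;
end
end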
Torsten
on 24 Jun 2022
"I cannot really afford the intermediate calculations unless they are needed..."
But you cannot change fmincon's behaviour.
"this would become much more expensive if a Hessian were added via finite differences, so I do not use a Hessian"
fmincon will get its Hessian one way or another: either you supply it, or it uses the gradients to build an approximation.
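For the interior-point algorithm, how that approximation is built can be chosen with the HessianApproximation option. A sketch ('bfgs' is the default and 'lbfgs' is the limited-memory variant), which constructs the Hessian from the supplied gradients rather than from extra objective evaluations:
% Sketch: let interior-point build its Hessian from the gradients (L-BFGS)
% instead of a user-supplied HessianFcn or finite differences.
options = optimoptions('fmincon', ...
    'Algorithm','interior-point', ...
    'SpecifyObjectiveGradient',true, ...
    'HessianApproximation','lbfgs', ...
    'Display','iter');
x = fmincon(@rosenboth,[-1,2],[],[],[],[],[],[],[],options);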
Fangning Zheng
on 24 Jun 2022
I think I will switch to gradient descent and test it without the Hessian calculation. Thank you for the answers!
John D'Errico
on 24 Jun 2022
You can. However, you should note that a basic gradient descent will converge extremely slowly on many problems.