Why does the trust-region-reflective algorithm in fmincon require a gradient as input?
Hello
According to the MATLAB fmincon documentation, the 'trust-region-reflective' algorithm requires the objective gradient to be specified.
1- Why does it need the gradient as input?
2- Does the objective function have to be analytical? My objective function is a function handle that computes some scalar as output.
Thanks in advance for your help
12 Comments
Torsten
on 25 Feb 2022
The objective function has to be differentiable and return a scalar value.
This makes sense: how would you minimize a matrix?
SM
on 25 Feb 2022
My guess is that for the trust-region-reflective algorithm it is very important that the gradient of the objective function is calculated with high precision. That's why the user is requested to supply it (at best analytically).
Of course, all the other methods also need the gradient of the objective, but it will be calculated within fmincon by numerical differentiation.
But why care about it - you won't be able to alter the requirements MATLAB imposes.
SM
on 25 Feb 2022
I tried to explain the logic behind it. Please read my response more carefully.
The essence is that all algorithms need the gradient of the objective, but that it is vital for the trust-region-reflective algorithm that the gradient is supplied with high precision. That's why the user is explicitly requested to supply it, at best analytically.
Of course you can also use the TRA when you have no analytical derivative of the objective, namely by computing the gradient of the objective numerically, but I don't know whether that is precise enough for the TRA to succeed, especially since you said that you do not even know whether your objective is differentiable. If not, better use ga instead of fmincon.
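For reference, supplying an analytic gradient to fmincon might look like the following. This is a minimal sketch: the quadratic objective, the bounds, and all names (myObj, etc.) are illustrative assumptions, not the asker's actual problem.

```matlab
% Sketch: analytic gradient supplied to the trust-region-reflective algorithm.
opts = optimoptions('fmincon', ...
    'Algorithm', 'trust-region-reflective', ...
    'SpecifyObjectiveGradient', true);

x0 = ones(120, 1);
lb = -2*ones(120, 1);   % TRR handles bound constraints OR linear
ub =  2*ones(120, 1);   % equalities, but not both at once
x  = fmincon(@myObj, x0, [], [], [], [], lb, ub, [], opts);

function [f, g] = myObj(x)
    f = sum(x.^2) + 0.5*sum(x);   % scalar objective value
    if nargout > 1                % solver asked for the gradient too
        g = 2*x + 0.5;            % analytic gradient, same size as x
    end
end
```

With 'SpecifyObjectiveGradient' set to true, fmincon calls the objective with two outputs and uses the second as the gradient instead of approximating it.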
SM
on 25 Feb 2022
(f(x(1), ..., x(i)+h, ..., x(120)) - f(x(1), ..., x(120))) / h
gives you an approximation to the i-th element of the gradient of the objective function f, where h is a small number (e.g. 1e-8).
There are several MATLAB codes in the File Exchange that calculate numerical gradients of a scalar function (I think Matt J gave a link to one of these codes).
But when using fmincon, make sure that your objective and your constraints are really differentiable with respect to the variables. If not, better use ga.
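A forward-difference gradient along the lines of the formula above could be sketched as follows (a hedged helper; the name fdGradient and the default step h = 1e-8 are my own choices, not from a File Exchange submission):

```matlab
% Forward-difference approximation of the gradient of a scalar function f
% at the point x. One extra function evaluation per variable.
function g = fdGradient(f, x, h)
    if nargin < 3
        h = 1e-8;          % step size: trades truncation vs round-off error
    end
    n  = numel(x);
    g  = zeros(n, 1);
    f0 = f(x);             % baseline function value
    for i = 1:n
        xi    = x;
        xi(i) = xi(i) + h; % perturb the i-th variable only
        g(i)  = (f(xi) - f0) / h;
    end
end
```

For better accuracy at twice the cost, central differences (f(x+h*e_i) - f(x-h*e_i)) / (2h) can be used instead.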
Walter Roberson
on 26 Feb 2022
Edited: Walter Roberson
on 26 Feb 2022
https://nmayorov.wordpress.com/2015/06/19/trust-region-reflective-algorithm/ has some description of the algorithm.
I do not see anything in there that requires analytic solutions, but I might have missed something.
Bruno Luong
on 26 Feb 2022
"I have 120 variables (x) and trying to minimise my objective function (f). The objective function is calculated numerically at each iteration, and I cannot calculate its gradient since there is no explicit formula."
This statement is simply not true. I have minimized a function with 1e6 variables and no explicit formula (solving a PDE), and I can compute the exact gradient by the chain rule and the adjoint equation. The only requirement is that the code that computes the objective function uses only differentiable functions and operators (so no min, max, or if) within it.
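A toy illustration of this point: the objective below has no closed-form formula (evaluating it means solving a linear system K*u = x), yet the exact gradient follows from the chain rule / adjoint equation. Everything here (K, d, the choice of objective) is a made-up example, not the asker's problem.

```matlab
% Objective: J(x) = 0.5 * ||u - d||^2  where u solves  K*u = x.
% Since u = K \ x, the chain rule gives  grad J(x) = K' \ (u - d).
K = gallery('tridiag', 120);   % stand-in for a discretized PDE operator
d = ones(120, 1);              % target data

J     = @(x) 0.5 * norm(K\x - d)^2;   % no explicit formula in x
gradJ = @(x) K' \ (K\x - d);          % exact gradient via the adjoint solve
```

The gradient costs one extra linear solve (the adjoint solve with K'), independent of the number of variables, which is why this scales to 1e6 unknowns.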
Walter Roberson
on 26 Feb 2022
I could imagine situations, such as the function involving the solution of integral equations, where no closed-form formula exists. For example, ODEs often have a general solution involving the exp() of the sum of the roots of an integral equation.
More Answers (2)
Zaikun Zhang
on 26 Feb 2022
Bruno Luong
on 26 Feb 2022
Edited: Bruno Luong
on 26 Feb 2022
I don't think Matt's answer is correct; rather, Torsten's answer is better. The trust-region algorithm must approximate the objective function within the current "trust region". It typically uses the cost function and gradient at many points to build a quadratic model. That requires both the cost function and the gradient to be continuous with respect to the points at which they (and the objective) are evaluated. Therefore numerical finite differences with an adaptive step are not suitable, since there is no known method to make the step vary continuously with the point.
Therefore the TR algorithm requires the user to supply the "exact" gradient. Otherwise it won't converge.
Better still, if the user provides a Hessian-times-vector calculation, it will help the TR algorithm work even more efficiently.
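As a hedged sketch of supplying curvature information to the trust-region-reflective algorithm: with 'HessianFcn' set to 'objective', the objective returns the Hessian as a third output (the quadratic below is purely illustrative; check the fmincon documentation for your release, since the exact option names have changed over versions):

```matlab
% Sketch: objective returning value, gradient, and Hessian for TRR.
opts = optimoptions('fmincon', ...
    'Algorithm', 'trust-region-reflective', ...
    'SpecifyObjectiveGradient', true, ...
    'HessianFcn', 'objective');

function [f, g, H] = quadObj(x)
    f = sum(x.^2);            % scalar objective
    g = 2*x;                  % analytic gradient
    H = 2*speye(numel(x));    % Hessian, sparse where possible
end
```

For large problems where forming H is too expensive, fmincon also offers a 'HessianMultiplyFcn' option for trust-region-reflective, which supplies products H*Y without ever building the full Hessian, matching the Hessian-times-vector idea above.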
5 Comments
Matt J
on 26 Feb 2022
I don't follow that reasoning. By default, the TR method computes the Hessian using finite differences. I don't see why TR would be tolerant to finite difference approximations to the Hessian, but not the gradient.
Bruno Luong
on 26 Feb 2022
Edited: Bruno Luong
on 26 Feb 2022
They use the (L-)BFGS formula to approximate the Hessian from gradients, not finite differences. The difference from other methods is that TR has to solve a subproblem using the quadratic model (thus the Hessian), rather than just a descent direction (1D), which is more or less d := -H*g in many cases. That makes the requirement of a smooth gradient more important.
Bruno Luong
on 26 Feb 2022
OK, I missed that.
But my point does not change: if they compute H by finite differences on g, it requires g to be exact so that H is meaningfully estimated. The bottom line is that TR requires more accuracy on g, and not because of the computation time.