fmincon check gradients fail

Question

mahmoud tarek on 14 Mar 2020

0
Link

Direct link to this question

https://au.mathworks.com/matlabcentral/answers/510863-fmincon-check-gradients-fail

Edited: Matt J on 26 Mar 2020

I'm trying to provide vectorized gradients for my constraints of my 'nonlincon' function to reduce the run time.

My optimization problem has 11 variables from x1 to x11 and my 'nonlincon' calculates 6 constraints so my gradient matrix is 11 by 6

i tried to mimic how fmincon calculates delta for finite differences.

but when i use check gradients with gradient constraint options the check gradients fails no matter what value i use for finite difference step size.

is there something wrong with how i am calculating the gradients or my delta?

this is the options i used with fmincon

  options_fmincon = optimoptions('fmincon','ConstraintTolerance', 0.001,'Diagnostics','on','Display','iter-detailed','PlotFcn',{@optimplotx,@optimplotfunccount,@optimplotfval,@optimplotconstrviolation,@optimplotstepsize,@optimplotfirstorderopt},'OptimalityTolerance',0.001,'SpecifyConstraintGradient',true,'CheckGradients',true,'FiniteDifferenceStepSize',0.01,'TypicalX',ones(1,11));

The code is like this :

% fmincon Nonlinear constraints function
function [C, CEQ, Jacobian, DCEQ] = NonLinCon(X, LUT, IN_SPEC)
 DOF.M(1).A = X(1);
 DOF.M(2).A = X(2);
 DOF.M(3).A = X(3);
 DOF.M(4).A = X(4);
 DOF.M(5).A = X(5);
 DOF.M(1).B = X(6);
 DOF.M(2).B = X(7);
 DOF.M(3).B = X(8);
 DOF.M(4).B = X(9);
 DOF.M(5).B = X(10);
 DOF.IB = X(11)*1e-3;
DOF.C = IN_SPEC.C;          
DOF.D = IN_SPEC.D;
DOF.E = IN_SPEC.E;
DOF.F = IN_SPEC.F;
DOF.G = IN_SPEC.G;
DOF.H = IN_SPEC.H;
DOF.M(6).A = DOF.M(5).A;
DOF.M(6).B = DOF.M(5).B;
[~, OUT_SPEC] = calculatespecs(LUT,DOF);
if nargout > 2
    T_X = ones(1,11);
   v = ones(1,11);
   delta = v.*sign(X).*max(abs(X),T_X);
        Xp = X;
        DOF_p = DOF;
 DOF_p.M(1).A = [Xp(1) + delta(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1)];
 DOF_p.M(2).A = [Xp(2), Xp(2) + delta(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2)];
 DOF_p.M(3).A = [Xp(3), Xp(3), Xp(3) + delta(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3)];
 DOF_p.M(4).A = [Xp(4), Xp(4), Xp(4), Xp(4) + delta(4), Xp(4), Xp(4), Xp(4), Xp(4), Xp(4), Xp(4), Xp(4)];
 DOF_p.M(5).A = [Xp(5), Xp(5), Xp(5), Xp(5), Xp(5) + delta(5), Xp(5), Xp(5), Xp(5), Xp(5), Xp(5), Xp(5)];
 DOF_p.M(1).B = [Xp(6), Xp(6), Xp(6), Xp(6), Xp(6), Xp(6) + delta(6), Xp(6), Xp(6), Xp(6), Xp(6), Xp(6)];
 DOF_p.M(2).B = [Xp(7), Xp(7), Xp(7), Xp(7), Xp(7), Xp(7), Xp(7) + delta(7), Xp(7), Xp(7), Xp(7), Xp(7)];
 DOF_p.M(3).B = [Xp(8), Xp(8), Xp(8), Xp(8), Xp(8), Xp(8), Xp(8), Xp(8) + delta(8), Xp(8), Xp(8), Xp(8)];
 DOF_p.M(4).B = [Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9) + delta(9), Xp(9), Xp(9)];
 DOF_p.M(5).B = [Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10) + delta(10), Xp(10)];
 DOF_p.IB = [Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3 + delta(11)];
 
 DOF_p.M(6).A = DOF_p.M(5).A;
 DOF_p.M(6).B = DOF_p.M(5).B;
        
        [~, OUT_SPEC_p] = calculatespecs(LUT,DOF_p);
       
        Jacobian(:,1) = abs((double(OUT_SPEC_p.A) - double(OUT_SPEC.A)) ./ delta);
        Jacobian(:,2) = abs((double(OUT_SPEC_p.B) - double(OUT_SPEC.B)) ./ delta);
        Jacobian(:,3) = abs((log10(double(OUT_SPEC_p.C)) - log10(double(OUT_SPEC.C))) ./ delta);
        Jacobian(:,4) = abs((double(OUT_SPEC_p.D) - double(OUT_SPEC.D)) ./ delta);
        Jacobian(:,5) = abs((double(OUT_SPEC_p.E) - double(OUT_SPEC.E)) ./ delta);
        Jacobian(:,6) = abs((double(OUT_SPEC_p.F) - double(OUT_SPEC.F)) ./ delta);
    DCEQ = [];
end
C(1) = IN_SPEC.A ./ double(OUT_SPEC.A) - 1;
C(2) = IN_SPEC.B ./ double(OUT_SPEC.B) - 1;
C(3) = log10(IN_SPEC.C) ./ log10(double(OUT_SPEC.C)) - 1;
C(4) = IN_SPEC.D ./ double(OUT_SPEC.D) - 1;
C(5) = IN_SPEC.E ./ double(OUT_SPEC.E) - 1;
C(6) = IN_SPEC.F ./ double(OUT_SPEC.F) - 1;
CEQ = [];
    
end
        
       

4 Comments
Show 2 older commentsHide 2 older comments

mahmoud tarek on 26 Mar 2020

I tried to use tic toc at the start and end of my NonLinCon and Obj functions to calculate the time of one call and this is what i got:

First for NonLinCon with only 2 output arguments:

The function is called multiple times per iteration. The call time varies between 0.010 to 0.2 seconds.

Second for NonLinCon with 4 output arguments:

The call time varies between 0.01 to 0.06 seconds.

My Obj function has no gradient function.

The run time of my Obj function with one output argument varies between 0.000002 to 0.000759 seconds

Matt J on 26 Mar 2020

Edited: Matt J on 26 Mar 2020

The call time varies between 0.010 to 0.2 seconds.

Don't you mean 0.01 to 0.02?

The function is called multiple times per iteration

You need to count how many times in an iteration it calls wtih 2 arguments (when it doesn't need the gradient) and how many times it calls with 4 (I assume once per iteration). If NonLinCon gets called 20 times per iteration with just 2 arguments, then accelerating the gradient calculation won't have much of an impact, because the gradient calculation represents a small fraction of the computation per iteration.

It may be helpful to move your finite difference operations to a separate function, called from within NonLinCon. You can then use the profiler

https://www.mathworks.com/help/matlab/matlab_prog/profiling-for-improving-performance.html

to see how much times is spent in gradient calculation versus other parts of NonLinCon.

Sign in to comment.

Sign in to answer this question.

Answer 1

Matt J on 14 Mar 2020

0
Link

Direct link to this answer

https://au.mathworks.com/matlabcentral/answers/510863-fmincon-check-gradients-fail#answer_420142

Edited: Matt J on 14 Mar 2020

CheckGradient uses central differences, whereas you appear to be using right handed differences. Are you certain that your constraints are differentiable?

3 Comments
Show 1 older commentHide 1 older comment

mahmoud tarek on 17 Mar 2020

Open in MATLAB Online

i tried to calculate the gradients using central differences to match check gradients but still fails

T_X = ones(1,11);
   v = 0.01*ones(1,11);                       % FiniteDifferenceStepSize
   delta = v.*max(abs(X),T_X);           % Delta for central finite differences
Xp = X;
        DOF_p = DOF;
 DOF_p.M(1).A = [Xp(1) + delta(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1), Xp(1)];
 DOF_p.M(2).A = [Xp(2), Xp(2) + delta(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2), Xp(2)];
 DOF_p.M(3).A = [Xp(3), Xp(3), Xp(3) + delta(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3), Xp(3)];
 DOF_p.M(4).A = [Xp(4), Xp(4), Xp(4), Xp(4) + delta(4), Xp(4), Xp(4), Xp(4), Xp(4), Xp(4), Xp(4), Xp(4)];
 DOF_p.M(5).A = [Xp(5), Xp(5), Xp(5), Xp(5), Xp(5) + delta(5), Xp(5), Xp(5), Xp(5), Xp(5), Xp(5), Xp(5)];
 DOF_p.M(1).B = [Xp(6), Xp(6), Xp(6), Xp(6), Xp(6), Xp(6) + delta(6), Xp(6), Xp(6), Xp(6), Xp(6), Xp(6)];
 DOF_p.M(2).B = [Xp(7), Xp(7), Xp(7), Xp(7), Xp(7), Xp(7), Xp(7) + delta(7), Xp(7), Xp(7), Xp(7), Xp(7)];
 DOF_p.M(3).B = [Xp(8), Xp(8), Xp(8), Xp(8), Xp(8), Xp(8), Xp(8), Xp(8) + delta(8), Xp(8), Xp(8), Xp(8)];
 DOF_p.M(4).B = [Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9), Xp(9) + delta(9), Xp(9), Xp(9)];
 DOF_p.M(5).B = [Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10), Xp(10) + delta(10), Xp(10)];
 DOF_p.IB = [Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3, Xp(11)*1e-3 + delta(11)];
 DOF_p.M(6).A = DOF_p.M(5).L;
 DOF_p.M(6).B = DOF_p.M(5).RHO;
        
        [~, OUT_SPEC_p] = calculatespecs(LUT,DOF_p);
        
 Xn = X;
        DOF_n = DOF;
 DOF_n.M(1).A = [Xn(1) - delta(1), Xn(1), Xn(1), Xn(1), Xn(1), Xn(1), Xn(1), Xn(1), Xn(1), Xn(1), Xn(1)];
 DOF_n.M(2).A = [Xn(2), Xn(2) - delta(2), Xn(2), Xn(2), Xn(2), Xn(2), Xn(2), Xn(2), Xn(2), Xn(2), Xn(2)];
 DOF_n.M(3).A = [Xn(3), Xn(3), Xn(3) - delta(3), Xn(3), Xn(3), Xn(3), Xn(3), Xn(3), Xn(3), Xn(3), Xn(3)];
 DOF_n.M(4).A = [Xn(4), Xn(4), Xn(4), Xn(4) - delta(4), Xn(4), Xn(4), Xn(4), Xn(4), Xn(4), Xn(4), Xn(4)];
 DOF_n.M(5).A = [Xn(5), Xn(5), Xn(5), Xn(5), Xn(5) - delta(5), Xn(5), Xn(5), Xn(5), Xn(5), Xn(5), Xn(5)];
 DOF_n.M(1).B = [Xn(6), Xn(6), Xn(6), Xn(6), Xn(6), Xn(6) - delta(6), Xn(6), Xn(6), Xn(6), Xn(6), Xn(6)];
 DOF_n.M(2).B = [Xn(7), Xn(7), Xn(7), Xn(7), Xn(7), Xn(7), Xn(7) - delta(7), Xn(7), Xn(7), Xn(7), Xn(7)];
 DOF_n.M(3).B = [Xn(8), Xn(8), Xn(8), Xn(8), Xn(8), Xn(8), Xn(8), Xn(8) - delta(8), Xn(8), Xn(8), Xn(8)];
 DOF_n.M(4).B = [Xn(9), Xn(9), Xn(9), Xn(9), Xn(9), Xn(9), Xn(9), Xn(9), Xn(9) - delta(9), Xn(9), Xn(9)];
 DOF_n.M(5).B = [Xn(10), Xn(10), Xn(10), Xn(10), Xn(10), Xn(10), Xn(10), Xn(10), Xn(10), Xn(10) - delta(10), Xn(10)];
 DOF_n.IB = [Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3, Xn(11)*1e-3 - delta(11)];
 DOF_n.M(6).A = DOF_n.M(5).A;
 DOF_n.M(6).B = DOF_n.M(5).B;
        
        [~, OUT_SPEC_n] = calculatespecs(LUT,DOF_n); 
 
        %%%%%%%%%%%  Jacobian using central finite differences   %%%%%%%%%%%%%
        Jacobian(:,1) = (double(OUT_SPEC_p.A) - double(OUT_SPEC_n.A)) ./ (2.*delta);
        Jacobian(:,2) = (double(OUT_SPEC_p.B) - double(OUT_SPEC_n.B)) ./ (2.*delta);
        Jacobian(:,3) = (log10(double(OUT_SPEC_p.C)) - log10(double(OUT_SPEC_n.C))) ./ (2.*delta);
        Jacobian(:,4) = (double(OUT_SPEC_p.D) - double(OUT_SPEC_n.D)) ./ (2.*delta);
        Jacobian(:,5) = (double(OUT_SPEC_p.E) - double(OUT_SPEC_n.E)) ./ (2.*delta);
       Jacobian(:,6) = (double(OUT_SPEC_p.F) - double(OUT_SPEC_n.F)) ./ (2.*delta);

Matt J on 17 Mar 2020

Edited: Matt J on 17 Mar 2020

I recommend plotting your Jacobian elements as a function of delta. They should converge to something as delta gets smaller. The first question is if they do converge. The second question is, if they do converge, how does the limit compare to Matlab's Jacobian calculation result and/or the result of jacobianest (from the link I gave earlier). Do this analysis for all the elements.

Sign in to comment.

fmincon check gradients fail

4 Comments
Show 2 older commentsHide 2 older comments

Accepted Answer

3 Comments
Show 1 older commentHide 1 older comment

More Answers (0)

See Also

Categories

Tags

Community Treasure Hunt

fmincon check gradients fail

4 Comments Show 2 older commentsHide 2 older comments

Accepted Answer

3 Comments Show 1 older commentHide 1 older comment

More Answers (0)

See Also

Categories

Tags

Community Treasure Hunt

4 Comments
Show 2 older commentsHide 2 older comments

3 Comments
Show 1 older commentHide 1 older comment