fminunc step size too small

Matilda on 6 Feb 2025
Edited: Matt J on 7 Feb 2025
Hello Matlab community,
I am using fminunc to fit a function to a set of data points. The optimizer stops due to a small step size; however, I have noticed that the steps it takes are far too small. I don't want to set the minimum step size lower than 1e-3, but if I do so it just stops immediately.
The workaround I have found is to provide the function with my own calculated gradient and manually multiply it by 1e6.
Is there an optimizer option that can rescale the step size to gradient ratio in a similar way?
My optimizer options are:
options = optimoptions('fminunc', ...
'Display', 'iter', ...
'Algorithm', 'trust-region', ...
'HessianFcn','objective', ...
'SpecifyObjectiveGradient',true, ...
'StepTolerance', 5e-3, ...
"FiniteDifferenceStepSize", 0.1, ...
"FunctionTolerance",0.00001, ...
"OptimalityTolerance",2e-6/factor, ...
'MaxFunctionEvaluations', 10000, ...
'MaxIterations', 10000, ...
'OutputFcn', @saveIterations);
Thank you in advance for your help!

Accepted Answer

Matt J on 6 Feb 2025
Edited: Matt J on 6 Feb 2025
It's not clear to me whether you are talking about the StepTolerance or the FiniteDifferenceStepSize, both of which you have set to unusually large values. The StepTolerance has no effect on how large a step fminunc takes at each iteration. It only tells fminunc when to stop.
The FiniteDifferenceStepSize affects how the gradient is approximated. If you find that the optimizer stops immediately with large values of FiniteDifferenceStepSize, it probably means that your objective function is piecewise constant. If your objective f(x) is piecewise constant, then stopping immediately is the appropriate thing for fminunc to do, because virtually every point would be a local minimum. A piecewise constant objective can arise if you are doing rounding, nearest-neighbor interpolation, or any other quantization operation in the calculation of f(x). These operations are not legitimate here: they violate the theoretical assumptions of fminunc.
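As a rough illustration of why quantization interferes with gradient estimation, the sketch below compares forward-difference estimates on a made-up objective that rounds its input to three decimals. The objective `f`, the point `x0`, and both step sizes are assumptions chosen only for demonstration; none of them come from the original problem.

```matlab
% Hypothetical objective: a quadratic evaluated on x rounded to 3 decimals,
% so f is constant on intervals of width 1e-3.
f = @(x) (round(x, 3) - 2).^2;

x0 = 0.5;

% Forward difference with a tiny step (sqrt(eps) is about 1.5e-8).
% Both evaluations land on the same plateau, so the estimate is zero.
hSmall = sqrt(eps);
gSmall = (f(x0 + hSmall) - f(x0)) / hSmall;

% Forward difference with a step much wider than the 1e-3 plateaus.
hLarge = 0.1;
gLarge = (f(x0 + hLarge) - f(x0)) / hLarge;

fprintf('small-step gradient estimate: %g\n', gSmall);
fprintf('large-step gradient estimate: %g\n', gLarge);
```

With a zero gradient estimate, a gradient-based solver has no reason to move. The wide step recovers something close to the slope of the underlying smooth quadratic, which is -3 at this point.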
  9 Comments
Matilda on 7 Feb 2025
The objective function is consistent between runs of fminunc, but it is piecewise constant due to a thresholding operation. The scale at which it is piecewise constant is very small, and therefore limiting the step size to a larger number would overcome this issue, but then the optimizer stops too early. I suspect this is due to a small gradient.
I am assuming that the gradient magnitude sets the step size of the optimizer. I believe the relationship between the two is not working in my case, and I would like the gradient to give a bigger step size.
Alternatively, I am looking for an optimizer that is better suited than fminunc to my particular situation.
Thank you all for your help!
Matt J on 7 Feb 2025
Edited: Matt J on 7 Feb 2025
"The scale at which it is piecewise constant is very small and therefore limiting the step size to a larger number would overcome this issue but then the optimizer stops too early. I suspect this is due to a small gradient."
The OptimalityTolerance parameter determines at what gradient magnitude the iterations will stop, so you could make it smaller. As I said before, your StepTolerance, and in fact all of your tolerance selections, look unusually large. Remember, there is no connection between the FiniteDifferenceStepSize and any of the tolerance parameters. They are independent things, so making the finite-difference step size large doesn't mean the stopping tolerances should be made large as well. I usually set,
FunctionTolerance=StepTolerance=OptimalityTolerance = 1e-12
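For reference, here is one way the tolerance choice above might be written with optimoptions. This is only a sketch: it assumes the Optimization Toolbox is installed, keeps the trust-region algorithm and user-supplied gradient from the question, and leaves every other option at its default.

```matlab
% Requires Optimization Toolbox.
tol = 1e-12;  % tight stopping tolerance, as suggested above

options = optimoptions('fminunc', ...
    'Algorithm', 'trust-region', ...
    'SpecifyObjectiveGradient', true, ...
    'FunctionTolerance', tol, ...
    'StepTolerance', tol, ...
    'OptimalityTolerance', tol);

disp(options.StepTolerance);
```

These three options only control when the iterations stop. They do not change how far each iteration moves.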
"...but it is piecewise constant due to a thresholding operation. The scale at which it is piecewise constant is very small and therefore limiting the step size to a larger number would overcome this issue"
Perhaps, but I think the better strategy would be to find a differentiable alternative for all the discontinuous operations you might be doing in your objective function. The thresholding operation could be approximated by a sigmoid or arctan() function, for example.
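A minimal sketch of what such a replacement could look like follows. The threshold `t`, the steepness `k`, and all function names are hypothetical; the actual thresholding operation in the original objective is not shown in this thread, so this only illustrates the general idea.

```matlab
t = 0.5;   % hypothetical threshold
k = 50;    % hypothetical steepness; larger k approaches the hard step

% Hard threshold: piecewise constant, zero derivative almost everywhere.
hardStep = @(z) double(z > t);

% Smooth surrogates that pass through 0.5 at z = t.
sigmoidStep = @(z) 1 ./ (1 + exp(-k * (z - t)));
atanStep    = @(z) 0.5 + atan(k * (z - t)) / pi;

% Analytic derivative of the sigmoid surrogate, for a user-supplied gradient.
sigmoidGrad = @(z) k * sigmoidStep(z) .* (1 - sigmoidStep(z));

z = [0.3 0.5 0.7];
disp([hardStep(z); sigmoidStep(z); atanStep(z)]);
```

The steepness is a trade-off: a large `k` tracks the hard threshold more closely, but the gradient becomes very small away from `t`, which can bring back the flat-region behavior.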


More Answers (1)

Catalytic on 6 Feb 2025
Edited: Catalytic on 6 Feb 2025
"The solution I have found is to provide the function with my own calculated gradient and manually multiplying it by 1e6."
That might mean your initial point is poorly chosen. In particular, it may lie in a region where the objective function is almost, but not perfectly, flat. This is similar to what happens in the following example. Notice that the gradient is never exactly zero anywhere except at the global minimum x=0, but for numerical purposes it may as well be. The optimization cannot move unless the initial point is chosen in an area where the gradient has substantial magnitude:
fun=@(x)1-exp(-x.^2/2);
opts=optimoptions('fminunc','Display','none');
x1 = fminunc(fun,3000,opts)
x1 = 3000
x2 = fminunc(fun,30,opts)
x2 = 30
x3 = fminunc(fun,3,opts)
x3 = 3.5861e-09
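To see why the first two runs do not move, it may help to evaluate the analytic gradient of this objective, x*exp(-x^2/2), at the three starting points. The handle name `gradFun` is introduced here only for illustration.

```matlab
% Analytic gradient of fun(x) = 1 - exp(-x.^2/2).
gradFun = @(x) x .* exp(-x.^2 / 2);

x0 = [3000 30 3];
g = gradFun(x0);

% At x = 3000 the exponential underflows to exactly 0 in double precision;
% at x = 30 the gradient is positive but far below any practical tolerance.
for i = 1:numel(x0)
    fprintf('x0 = %g, gradient = %g\n', x0(i), g(i));
end
```

Only at x0 = 3 is the gradient large enough, about 0.033, for the solver to make progress.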
  4 Comments
Sam Chak on 7 Feb 2025
Could you provide the data points or at least a portion of the data that exhibits the piecewise constant behavior? This may enable us to investigate whether your preferred fminunc() optimizer is effective.
Matt J on 7 Feb 2025
"What I think the issue might be is the fact that the gradient gets much less steep around the minimum creating a kind of minimum valley."
If so, you have an ill-conditioned problem (essentially a case of non-unique solutions), and need to regularize it. No algorithm can be guaranteed to reach a particular solution when a problem is ill-conditioned. You can also expect different results for different x0 and on different CPUs.
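One common form of regularization is to add a small quadratic penalty that pulls the solution toward a preferred point. The sketch below uses a deliberately tiny made-up problem with non-unique minimizers; `A`, `b`, `lambda`, and `xPrior` are all assumptions for illustration and are unrelated to the original fitting problem.

```matlab
% Hypothetical ill-conditioned problem: minimize (x1 + x2 - 1)^2.
% Every point on the line x1 + x2 = 1 is a minimizer.
A = [1 1];
b = 1;

lambda = 1e-3;    % hypothetical regularization weight
xPrior = [0; 0];  % hypothetical preferred solution

% Regularized objective; a handle like this could be passed to fminunc.
funReg = @(x) sum((A*x - b).^2) + lambda * sum((x - xPrior).^2);

% For this linear least-squares case the unique minimizer has a closed form.
xReg = (A.'*A + lambda*eye(2)) \ (A.'*b + lambda*xPrior);
disp(xReg.');
```

The penalty makes the minimizer unique (here both components equal 1/(2 + lambda)), at the cost of a small bias toward `xPrior`.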


Release

R2024a
