Steps between iterations too large

Martin Pott
Martin Pott on 26 Nov 2014
Edited: Matt J on 27 Nov 2014
I use the fminunc function (active-set algorithm) in Matlab to optimize a function. I expect the values of the optimized parameters to be somewhere between -10 and 10 for all values in the vector theta (a vector of 6 parameters) that is optimized.
I suspect that the optimizer should take really small steps in order to find a local minimum, however, it takes very large steps and the vector ends up with values in the tens of thousands.
Is there any way to reduce this step size, or is there an optimization algorithm that is better suited for my problem, that does not require the user to provide a hessian or gradient?
I have already tried using the fmincon function with bounds at -10 and 10. However, this yields results that sit at the bounds in almost all cases. Furthermore, I would like to determine the standard errors by using a resampling technique, which (if I'm correct) cannot handle values that lie at the bounds.
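(For reference, a bounded call of the kind described might look like the following sketch; myObjective and theta0 are placeholders for the actual objective function and starting point, not code from the original post.)

```matlab
% Sketch: FMINCON with box bounds on a 6-parameter vector theta.
lb = -10*ones(6,1);              % lower bounds
ub =  10*ones(6,1);              % upper bounds
theta0 = zeros(6,1);             % starting guess (placeholder)
opts = optimoptions('fmincon','Algorithm','interior-point');
[theta,fval] = fmincon(@myObjective, theta0, ...
                       [],[],[],[], lb, ub, [], opts);
```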
Thank you in advance,
Martin
  1 Comment
Matt J
Matt J on 26 Nov 2014
Furthermore, I would like to determine the standard errors by using a resampling technique, which makes it impossible to deal with values that are at the bounds (if I'm correct).
We discussed ways to extend standard error calculations to bounded regions with you here.


Answers (1)

Matt J
Matt J on 26 Nov 2014
Edited: Matt J on 26 Nov 2014
It sounds like you've simply coded your objective function incorrectly, such that the unconstrained minimum is at infinity. Make sure you're not accidentally maximizing the function you are trying to minimize. I.e., maybe the objective simply needs to be multiplied by -1.
As a check, you could also sample your function on a coarse 6-dimensional grid, e.g.,
[p{1:6}]=ndgrid(linspace(-10,10,10));
and see whether the minimum over these samples lies in the interior of the range [-10,10], as you expect.
  2 Comments
Martin Pott
Martin Pott on 26 Nov 2014
Dear Matt,
Thanks for the quick response. However, I am not really sure what I need to do with the coarse 6-dimensional grid. Do I need to paste the command and look directly at the values? They look something like this (but then for a lot more matrices):
val(:,:,1,1,1,1) =
-10 -10 -10 -10 -10
-10 -10 -10 -10 -10
-10 -10 -10 -10 -10
-10 -10 -10 -10 -10
-10 -10 -10 -10 -10
val(:,:,2,1,1,1) =
-5 -5 -5 -5 -5
-5 -5 -5 -5 -5
-5 -5 -5 -5 -5
-5 -5 -5 -5 -5
-5 -5 -5 -5 -5
I am actually maximizing the function, so I already multiply the f-value by a negative number to make sure the function is maximized.
Perhaps I should give some more information on the context of my problem. I use the optimized values in a formula to calculate a matrix of weights (p_w). Originally, these weights can be both positive and negative. For these cases the theta-values are as I would expect (somewhere between -10 and 10).
However, I also want to determine the optimal weights when only weights of zero and higher are allowed. For that, I use the following function, which also makes sure the sum of all weights is 1 for each row:
temp = sum(max(0,p_w),2);
p_w_plus = max(0,p_w)./repmat(temp, 1, 495);
Where p_w is the weight-matrix with dimension 600x495.
Imposing this alteration on the weight matrix is when the issues with the extreme values arise.
Thanks again for your help.
Matt J
Matt J on 26 Nov 2014
Edited: Matt J on 27 Nov 2014
However, I am not really sure what I need to do with the coarse 6-dimensional grid.
You can now do
p=reshape(cat(7,p{:}),[],6)
Each row of p is the coordinate of a point on the grid. You can loop through these points, evaluating your objective function at each. Then locate the minimum over all points.
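(The grid search described above can be sketched as follows; myObjective is a placeholder for the actual objective function, which is assumed to accept a 1x6 row vector.)

```matlab
% Sketch of the coarse grid search: sample the objective on a
% 10-point-per-dimension grid over [-10,10]^6 and find the minimum.
[p{1:6}] = ndgrid(linspace(-10,10,10));  % 10^6 grid points
p = reshape(cat(7,p{:}),[],6);           % one grid point per row

fvals = zeros(size(p,1),1);
for k = 1:size(p,1)
    fvals(k) = myObjective(p(k,:));      % evaluate at each grid point
end

[fmin,idx] = min(fvals);
thetaBest = p(idx,:)   % check: is this interior to [-10,10]?
```

If thetaBest sits at a grid boundary, the unconstrained minimum likely lies outside [-10,10]^6, which would be consistent with the runaway parameter values.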
For that, I use the following function, which also makes sure the sum of all weights is 1 for each row:
You mean the objective function depends on theta through p_w_plus? If so, the objective function is probably not differentiable, since f(z) = max(0,z) is not. That could account for the dysfunctional step sizes. You have infinite/undefined Hessians in places. FMINUNC assumes a twice continuously differentiable function.
You also require differentiability because you are using the Hessian for other things; you need to ensure that the Hessian exists.
You might try a different (and differentiable) positivity transform, e.g.,
temp=p_w.^2;
p_w_plus=bsxfun(@rdivide, temp, sum(temp,2)); %row sum =1
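(A quick check of this transform on a small toy matrix, to confirm it produces nonnegative weights whose rows sum to 1; the 5x4 random matrix stands in for the actual 600x495 weight matrix.)

```matlab
% Toy demonstration of the squared positivity transform.
p_w = randn(5,4);                         % stand-in for the real p_w
temp = p_w.^2;                            % nonnegative and differentiable
p_w_plus = bsxfun(@rdivide, temp, sum(temp,2));

min(p_w_plus(:))    % all entries >= 0
sum(p_w_plus,2)     % each row sums to 1
```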
Your strategy of normalizing the row sum might be a bit of a problem, however. It could allow many theta to produce the same p_w_plus, leading to a singular Hessian. I'm not sure if the standard error estimates are valid in that case.

