
Hi,

First of all, I am using MATLAB R2017b and the Optimization Toolbox from 2019. I am using quadprog (not surprisingly ;)) to solve a QP problem. My problem is convex in reality, yet quadprog prints a message in the command window that the problem is non-convex. I am not using lower and upper bounds, only inequality constraints. I also checked whether my Hessian has negative eigenvalues, and it does: one or two really small ones (on the order of -1e-20). However, I run quadprog over multiple time steps, where only the linear term f (in my case F) is updated, so it changes a little after every time step. If I run the code for, say, 10 time steps, I get the non-convexity message 8 times, and twice quadprog reports that the problem is convex and a minimum has been found. This seems quite weird to me, because the Hessian (and with it the one or two tiny negative eigenvalues) stays the same at every time step.

Does anyone have a potential fix for this issue, or an idea of its cause? If you think you need more information, please let me know.

tic;

% Symmetrize the Hessian before the call, not after it, so quadprog
% actually receives the symmetrized matrix:
G = (G+G')/2;

[X,fval,exitflag,output,lambda] = quadprog(G,F,Aineq_SparseCondensed,bineq_SparseCondensed,[],[],[],[],[],options);

toc
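A quick numerical convexity check that could be run before the call (a sketch; `G` is the Hessian from the snippet above, and the 1e-12 threshold is an arbitrary illustrative choice, not an official tolerance):

```matlab
Gs = (G + G')/2;               % symmetric part; removes round-off asymmetry
minEig = min(eig(full(Gs)));   % should be strictly positive for strong convexity
rc = rcond(full(Gs));          % values near eps mean numerically singular
fprintf('min eig = %g, rcond = %g\n', minEig, rc);
if minEig <= 0 || rc < 1e-12
    warning('G is not numerically positive definite; quadprog may flag non-convexity.');
end
```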

Thanks in advance!

Kind regards


Matt J
on 23 Jan 2020

Edited: Matt J
on 23 Jan 2020

You cannot submit a problem that is borderline convex to quadprog when it is using an algorithm that expects convexity. If you do, quadprog may see the problem as convex or non-convex arbitrarily, depending on floating point errors that essentially contribute random noise to the calculations it is doing.

I would guess that your Hessian is supposed to be more distinctly positive definite, so you should review the way you set up the problem.

S.
on 24 Jan 2020

Thanks for your fast response.

The problem is not borderline convex, but completely convex in reality. The issue only starts to occur as the Hessian becomes larger. Furthermore, if I use, for example, lower and upper bounds instead of the inequality constraints, the problem is reported as convex, which is strange, because the two formulations are almost equivalent. If I use both lb/ub and the inequality constraints, the solving time becomes extremely long and the solution is wrong.

Do you maybe know a systematic approach I could use to find the cause of this problem (for example, to determine whether numerical issues cause it), or settings in quadprog I could try? Furthermore, how could I make my Hessian more positive definite just by changing the setup of my problem? Could you give an example?

Matt J
on 24 Jan 2020

The problem is not borderline convex, but completely convex in reality...Furthermore, if I use for example lower bound and upper bound instead of the inequality constraints the problem is convex

No, whether a quadratic problem is strongly convex (which you want, so that numerical behavior is good) is dictated entirely by the eigenvalues of your Hessian. Adding or removing bounds and other linear inequality constraints does not influence that.

However, the ability of quadprog to recognize that your problem is strongly convex can be impaired by floating point errors when your Hessian matrix is nearly singular. Moreover, changing the constraints can conceivably affect the floating point errors that are encountered and which software flags they trigger. For example, under some particular constraints, the algorithm might try to step in a direction that lies in the null space of your Hessian. Along such directions the quadratic objective function should have zero curvature, but due to floating point errors the curvature calculated by the algorithm may come out slightly negative, so it raises a flag that the problem is non-convex.

Do you maybe know a systematic approach I could perform to know what causes this problem

In addition to checking the Hessian eigenvalues as you have done (they should all be strictly positive), you should check its reciprocal condition number rcond(H). If it is very close to zero, it means H is nearly singular and hence not strongly convex, for numerical purposes.
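For instance, a minimal diagnostic sketch (assuming the Hessian is in a variable `H`):

```matlab
H = (H + H')/2;                       % work with the symmetric part
e = eig(full(H));
fprintf('min eig = %g, max eig = %g\n', min(e), max(e));
fprintf('rcond   = %g\n', rcond(full(H)));

% chol is a cheap, standard positive definiteness test:
% p == 0 if and only if H is numerically positive definite.
[~, p] = chol(H);
fprintf('chol flag p = %d (0 means positive definite)\n', p);
```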

Furthermore, how could I get my Hessian more positive definite by just changing the set up of my problem? Could you give an example?

Often, a near-singular Hessian arises because there aren't enough equations to determine the optimum uniquely, or because there are redundant variables. For example, in a linear least squares problem

min_x ||A*x - b||^2

the Hessian is 2*A'*A, which will be singular if A does not have full column rank. Poor choices of units can also cause this behavior. For example, the quadratic function

f(x) = 1e12*x1^2 + x2^2

is theoretically strongly convex, but numerically it has a very singular-looking Hessian:

>> H=[1e12, 0; 0 1]*2; rcond(H),

ans =

1.0000e-12

But if I just change the units of x via the change of variables y1 = 1e6*x1, y2 = x2, then the new objective

f(y) = y1^2 + y2^2

has a distinctly non-singular Hessian:

>> H=eye(2)*2; rcond(H)

ans =

1
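The least squares case can be illustrated the same way. In this sketch (contrived data), the second column of A is a multiple of the first, so A lacks full column rank and the Hessian 2*A'*A is singular:

```matlab
A = [1 2; 3 6; 5 10];   % column 2 = 2*column 1, so A has rank 1
H = 2*(A'*A);           % Hessian of ||A*x - b||^2
fprintf('rank(A) = %d of %d columns\n', rank(A), size(A,2));
fprintf('rcond(H) = %g\n', rcond(H));   % ~0: H is singular
```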

S.
on 11 Feb 2020

Thanks for your response and the clarification. Sorry for my delayed response.

However, I have still some additional questions and will provide you with more information.

In your second answer, you mention that if the reciprocal condition number is very close to zero, the matrix is nearly singular. Could you indicate what ''very close to zero'' means, because it is quite subjective? Is there a boundary, in terms of order of magnitude, below which the reciprocal condition number indicates trouble? For example: if the reciprocal condition number is smaller than 1e-10, then the Hessian is nearly singular. Furthermore, is looking at the (ordinary) condition number also useful? If yes, from which order of magnitude onwards can the matrix be called ill-conditioned?

The optimization problem I am trying to solve is:

min sum_{i=1}^{N} (z_{i|k} - z_r)' Q (z_{i|k} - z_r) + (u_{i|k} - u_r)' R (u_{i|k} - u_r)

with z_{i|k} = H*x_{i|k}, u_{i|k} the predicted inputs, and x_{i|k} the predicted states.

In this optimal control problem, Q and R are both symmetric positive definite matrices: Q is diagonal with all diagonal entries equal to 100, and R is diagonal with all diagonal entries equal to 0.1. Furthermore, the mean of the vector z_r is around 5 and the mean of the vector u_r is around 0.1. The predicted inputs are constrained to lie between 0 and 10. Therefore, I think the units are fine.

Moreover, the future states can be written as a function of the future inputs, and after some substitution the optimization problem looks like:

with x_{i|k} = Z*u_{i|k} + z_k_hat,

where Z is a banded null-space matrix (it should keep a specific structure), Omega is a (block) diagonal matrix with transpose(H)*Q*H and R on its diagonal, C_1 is a vector composed of -2*transpose(z_r)*Q*H and -2*transpose(u_r)*R, z_k_hat is a matrix built as [A; A^2; ...; A^N], and A represents the dynamics of the system.
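A sketch of how Omega could be assembled and inspected, using the names from the description above (N, H, Q, R are assumed to already exist in the workspace):

```matlab
% Omega is block diagonal: N copies of H'*Q*H, then N copies of R.
Omega = blkdiag(kron(eye(N), H.'*Q*H), kron(eye(N), R));

% cond(Omega) = Inf for a diagonal matrix means some diagonal entry is 0.
% A zero on the diagonal of H'*Q*H occurs for any state that does not
% appear in z = H*x at all, which makes the objective singular.
d = diag(Omega);
fprintf('min diag = %g, max diag = %g\n', min(d), max(d));
```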

The condition number of Z is 1e7.

Do you have any ideas/suggestions on how I can reformulate this model? It is hard for me to know where to start, and I think the units are fine.

Matt J
on 11 Feb 2020

S.
on 16 Feb 2020

The condition numbers are as follows:

For Omega: Inf

For transpose(Z)*Omega*Z: 7.4375e+19

For Z: 5.6311e+08

Z is too big (7.5 MB). How could I send it?

Q, Ω, H and R can be found in Optimization2.zip.

Could you still answer the first question, please (about the reciprocal condition number and the condition number)?

S.
on 17 Feb 2020

This question:

In your second answer, you mention that if the reciprocal condition number is very close to zero, the matrix is nearly singular. Could you indicate what ''very close to zero'' means, because it is quite subjective? Is there a boundary, in terms of order of magnitude, below which the reciprocal condition number indicates trouble? For example: if the reciprocal condition number is smaller than 1e-10, then the Hessian is nearly singular. Furthermore, is looking at the (ordinary) condition number also useful? If yes, from which order of magnitude onwards can the matrix be called ill-conditioned?

I have now also attached the zmax and zmin of the Z-matrix, so all the matrices are added.

If you or somebody else could take a look at the matrices and point out anything that could be improved, or the cause of the problem, that would be really nice.

Matt J
on 17 Feb 2020

Is there a boundary in which order of magnitude the reciprocal condition number should lay?

You won't be able to identify a sharp boundary that applies to your situation, but a condition number of less than 1000 would definitely be considered good, and a condition number of Inf, like you have in your Omega matrix, definitely indicates a singular matrix. So your objective function is clearly not strongly convex. Since Omega is a diagonal matrix, I would simply investigate why its maximum diagonal element is so much larger than its minimum diagonal element...

To get a quantitative perspective on the condition number, you should understand that it is the factor by which relative (percent) errors get magnified when solving a set of linear equations A*x = b involving the matrix.

So, if quadprog reaches a step where it must solve a set of linear equations with condition number K, and there is a percent error delta in the equation data, then the results will have percent error K*delta. Since you have a condition number of K = 1e19, a percent error in the input data of only delta = 1e-19 % (which is beyond what double floating point precision can represent) is enough to cause a 1% error in the solution.
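A small, contrived demonstration of this magnification with a 2x2 system whose condition number is 1e10:

```matlab
A = [1 0; 0 1e-10];            % cond(A) = 1e10
b = [1; 1e-10];                % exact solution x = [1; 1]
bp = b + [0; 1e-12];           % relative perturbation in b of about 1e-12

x  = A \ b;
xp = A \ bp;

% The ~1e-12 relative error in b is amplified by roughly cond(A) = 1e10,
% giving a relative error in x on the order of 1e-2.
relErr = norm(xp - x)/norm(x);
fprintf('relative error in x: %g\n', relErr);
```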

S.
on 8 Mar 2020 at 14:54

Hi,

Thanks for your response.

I am also using the Gurobi Optimizer. Gurobi uses a barrier method and quadprog an interior-point method. However, quadprog handles ill-conditioned matrices better than Gurobi. Where quadprog gives me correct results for small and medium-size Hessians, Gurobi already gives bad results (and a warning that the range of the quadratic objective terms is large) for the same QP problem. Only for very large ill-conditioned Hessians does quadprog also report that the problem is non-convex. Do you maybe have an explanation for this? Why, even though they use essentially the same kind of algorithm, can quadprog handle the same QP problem better?

S.
on 8 Mar 2020 at 21:48

I tried to make the problem well-conditioned by scaling the matrices, but it didn't succeed.
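For reference, one standard scaling is the change of variables x = D*y with D = diag(1./sqrt(diag(H))), so the scaled Hessian has unit diagonal. A sketch with hypothetical variable names; it only helps when the ill-conditioning actually sits on the diagonal, and it fails if diag(H) contains zeros:

```matlab
% Jacobi-style symmetric scaling: substitute x = D*y. Requires diag(H) > 0.
d = sqrt(diag(H));
D = diag(1./d);
Hs = D*H*D;        % scaled Hessian: ones on the diagonal
fs = D*f;          % scaled linear term
As = Aineq*D;      % scaled inequality constraints (bineq unchanged)
[y, fval] = quadprog(Hs, fs, As, bineq, [], [], [], [], [], options);
x = D*y;           % map back to the original variables
```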

Is there more information available about the interior-point method that quadprog specifically uses (not general information, but specific to quadprog)?

S.
on 9 Mar 2020 at 10:51

Is this about nonlinear optimization and the fmincon function, not the barrier method?

Is there, by the way, a relationship between an ill-conditioned Hessian and the number of iterations needed to converge?

It seems that the solver needs a lot of iterations to find the global minimum precisely because the problem is ill-conditioned?

Matt J
on 9 Mar 2020 at 12:45

Is this about nonlinear optimization and the fmincon function, not the barrier method?

The link I gave you should take you specifically to the section on the interior-point method.

Is there, by the way, a relationship between an ill-conditioned Hessian and the number of iterations needed to converge? It seems that the solver needs a lot of iterations to find the global minimum precisely because the problem is ill-conditioned?

That is certainly true for gradient descent methods (e.g. conjugate gradient, steepest descent). See, for example,

Some methods automatically rescale the problem so that the condition number becomes small. Unconstrained Newton's method, for example, always converges in a single iteration on a quadratic objective. This assumes, however, that the Hessian, while ill-conditioned, is still reasonably non-singular. As we've discussed, your problem is essentially singular to within double precision.

S.
on 11 Mar 2020 at 19:40

Does the interior-point method of quadprog make use of gradient descent methods?

And does quadprog generally use many cheap iterations (little time each), or few expensive iterations (a lot of time each)? Based on my results I think the latter, but I am not sure. If so, is there an explanation for it?

Furthermore, I found the following on the website: ''The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix. Generally, the sparse implementation is faster on large, sparse problems, and the dense implementation is faster on dense or small problems.''

Why is this the case?

And, if you happen to know, what is the difference between the interior-point method and GPAD (a gradient projection method)? I know, for example, that the interior-point method needs the Hessian online to solve an MPC problem, whereas with GPAD this step can be performed offline.

Thanks in advance.

Matt J
on 11 Mar 2020 at 21:52

And does quadprog generally use many iterations with low cost/time of each iteration, or few iterations with expensive cost of each iteration (lot of time)? I think the latter one based on my results, but I am not sure. If so, is there an explanation for?

If there is, I am sure that it is in the algorithm description that I gave you the link for. It should be possible to deduce computational expense of the different steps of the algorithm from that.

The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix. Generally, the sparse implementation is faster on large, sparse problems, and the dense implementation is faster on dense or small problems.'' Why is this the case?

Matrix arithmetic is faster for large matrices if they are sparse, and if you manipulate them in sparse form, as Matlab's sparse type does.
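A quick illustration of that speed difference (sizes and density are arbitrary choices):

```matlab
n = 5000;
S = sprandsym(n, 1e-3);    % symmetric random sparse matrix, ~0.1% nonzeros
F = full(S);               % the same matrix stored densely
v = randn(n, 1);

tic; S*v; tSparse = toc;   % multiply in sparse form: touches only nonzeros
tic; F*v; tDense  = toc;   % multiply in dense form: touches all n^2 entries
fprintf('sparse: %.4g s, dense: %.4g s\n', tSparse, tDense);
```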

And maybe if you know, what is the difference between interior-point method and GPAD (Gradient Projection method)?

Well, for one thing, GPAD is not confined to travel through the interior of the feasible set, as the interior-point method is (hence its name).

S.
on 11 Mar 2020 at 23:19

If there is, I am sure that it is in the algorithm description that I gave you the link for. It should be possible to deduce computational expense of the different steps of the algorithm from that.

The problem is that the math is quite hard to understand for me. So I have no idea.

The algorithm has different implementations for a sparse Hessian matrix H and for a dense matrix. Generally, the sparse implementation is faster on large, sparse problems, and the dense implementation is faster on dense or small problems.'' Why is this the case?

Shouldn't it be "dense, small problems" instead of "dense or small problems"?

Well for one thing, GPAD is not confined to travel through the interior of the feasible set as the interior- point method is (hence its name).

What is the advantage/disadvantage of this?

Matt J
on 12 Mar 2020 at 12:45

The problem is that the math is quite hard to understand for me. So I have no idea.

I gave you the wrong link earlier. It is now corrected. In any case, you will see in the description that there are various steps within each iteration requiring the solution of linear equations. Those can be computationally burdensome, depending on the size and sparsity of your Hessian and constraints.

Shouldn't it be dense,small problems in stead of dense or small problems?

I don't think so. If your Hessian is large and dense, it would not be a good idea to use a sparse matrix data-type to represent it. You would want to use a dense type.

What is the advantage/disadvantage of this?

Roughly speaking, it lets you decompose the optimization into a series of unconstrained minimizations, which are easier than constrained ones. The solutions of those optimization problems lie in the interior of the feasible set, so you can find them just by setting gradients to zero.

S.
on 18 Mar 2020 at 11:20

What is the reason that quadprog was designed to solve only convex problems and not non-convex ones? Is there another function that can solve non-convex QP problems?

What gives the best results: using lower and upper bounds, or incorporating these bounds in the inequality constraints? What are the differences, and the advantages and disadvantages of each method?

Matt J
on 18 Mar 2020 at 15:45

What is the reason why it is chosen that quadprog can only solve convex problems and not non-convex problems? Is there another function that can solve non-convex QP problems?

It is only quadprog's interior-point algorithm that requires convexity. The trust-region-reflective method doesn't require convexity, but it does have certain restrictions on the kinds of constraints you can have (only equality constraints or bounds, but not both).

You can always use fmincon. However, the ill-conditioning of your problem can have undesirable effects no matter what solver you use. It generally slows convergence, for one thing. It also makes the result sensitive to floating point errors. Effectively, you have a non-unique set of optimal points when your problem is ill-conditioned. Which solution you get can vary greatly from computer to computer depending on its CPU architecture and the way it does floating point arithmetic.

It is still not clear to me why you got nowhere with the investigation of your Omega matrix, and why cond(Omega) = Inf. This has to mean that the diagonal elements are very different in scale, and they might be expressed in a poor choice of units. It should be very easy to determine whether this is the case.

What gives the best results: using lower and upper bounds, or incorporating these bounds in the inequality constraints? What are the differences, and the advantages and disadvantages of each method?

Bounds are simpler for many solvers to deal with than more general inequalities. If you lump the bounds in with the other linear inequalities, the solver does not know that some of your constraints are simpler than others and won't take advantage of that.

S.
on 18 Mar 2020 at 17:50

In my case, no extra inequality constraints are present, just the lower and upper bounds written as inequality constraints, so every constraint is equally simple/complex. Does the solver then still not see that each inequality constraint is just an upper or lower bound, and therefore equally simple?

Is a possible advantage of using lower and upper bounds also that a solution may be found faster, because the space in which the algorithm can search is predefined, whereas with inequality constraints the space is not directly restricted?

Furthermore, an advantage of inequality constraints is that it is possible to have a lower/upper bound on a linear combination of the decision variables, which is not possible with lower and upper bounds.

Matt J
on 18 Mar 2020 at 18:40

Just the lower and upper bounds written as inequality constraints, so every constraint is equally simple/complex. Does the solver then still not see that each inequality constraint is just an upper or lower bound, and therefore equally simple?

Not as far as I'm aware. It could probably deduce that from a pre-analysis of your Aineq matrix, but I don't think it would bother. It has no reason to expect you to express bounds through Aineq,bineq when you have the option of doing so more directly through lb,ub.

Is a possible advantage of using lower and upper bounds also that a solution may be found faster, because the space in which the algorithm can search is predefined, whereas with inequality constraints the space is not directly restricted?

For some algorithms, yes. Some of the legacy algorithms in the toolbox, however, do restrict their search to the polyhedron specified by Aineq*x<=bineq (the active-set methods do this, I think). It is, of course, harder to do that for general linear inequalities than for simple bounds.

Furthermore, an advantage of inequality constraints is that it is possible to have a lower/upper bound on a linear combination of the decision variables, which is not possible with lower and upper bound.

That's not a reason, though, to lump your bounds into Aineq, bineq. You can use lb,ub to specify the bounds, but still use Aineq,bineq for the more complicated constraints that can't be represented through lb,ub.
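In quadprog's interface, that split looks like the following sketch (the data here is a toy problem invented for illustration):

```matlab
n = 4;
H = 2*eye(n);              % toy strongly convex Hessian (hypothetical data)
f = -ones(n,1);
lb = zeros(n,1);           % simple bounds go in lb/ub ...
ub = 10*ones(n,1);
% ... and only genuinely general inequalities go in Aineq/bineq,
% e.g. a bound on a linear combination: x1 + x2 <= 1
Aineq = [1 1 zeros(1,n-2)];
bineq = 1;
x = quadprog(H, f, Aineq, bineq, [], [], lb, ub);
```

This way the solver can apply its special handling of bound constraints while still enforcing the general inequality.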

S.
on 18 Mar 2020 at 20:49

For some algorithms, yes. Some of the legacy algorithms in the toolbox, however, do restrict their search to the polyhedron specified by Aineq*x<=bineq (the active-set methods do this, I think). It is, of course, harder to do that for general linear inequalities than for simple bounds

What exactly is a polyhedron? I read a lot about it, but the interpretation is not really clear to me. For example, people also talk about polyhedral input and state constraints. Are input and state constraints always polyhedral? What does polyhedral mean exactly in that context?

Does the interior-point-convex algorithm also belong to the ''some algorithms, yes''?

So, what you are saying is that specifying the polyhedron is harder/costs more time for linear inequality constraints than for bounds? And therefore my sentence, ''because the space in which the algorithm can search is predefined and for inequality constraints the space is not directly restricted'', is not correct? It is just that the shape of the polyhedron is established faster by the algorithm for bounds than for linear inequality constraints.

That's not a reason, though, to lump your bounds into Aineq, bineq. You can use lb,ub to specify the bounds, but still use Aineq,bineq for the more complicated constraints that can't be represented through lb,ub.

But it is still a kind of advantage with respect to bounds, because with bounds, constraints on linear combinations cannot be specified.

Matt J
on 18 Mar 2020 at 23:14

What exactly is a polyhedron? I read a lot about it, but the interpretation is not really clear to me. For example, people also talk about polyhedral input and state constraints.

I was abusing terminology somewhat. A polyhedron is a 3D shape with flat sides. A convex polyhedron can always be described by a set of linear inequalities Aineq*x<=bineq. An N-dimensional shape described by a set of linear inequalities Aineq*x<=bineq is called a convex polytope. It is a generalization of a polyhedron from 3D to N-D.

Does the interior-point-convex algorithm also belong to the ''some algorithms,yes''?

I believe so. I haven't seen documentation to confirm it, but an interior-point algorithm by definition is one that tries to confine its search to the interior of the feasible set.

So, what you are saying is that specifying the polyhedron is harder/costs more time for linear inequality constraints than for bounds?

There are various advantages to constraints consisting of simple bounds that different algorithms may take advantage of in different ways. For one thing, it is easier to find an interior-point in a region specified by bounds. This is of course of high importance to an "interior-point" algorithm.

Incidentally, I was mistaken earlier when I said that the interior-point-convex algorithm will not pre-analyze your linear inequalities Aineq*x<=bineq to see if any of them are pure bounds. I found a section of the documentation that says otherwise:

"The algorithm first tries to simplify the problem by removing redundancies and simplifying constraints. The tasks performed during the presolve step can include the following:...Check if any linear inequality constraint involves only one variable. If so, check for feasibility, and then change the linear constraint to a bound. "

But as you can see, the fact that it does this step confirms that quadprog does give bound constraints special handling.

S.
on 20 Mar 2020 at 4:22

So, if there are polyhedral input and state constraints, the constraints describe a space, ''the polyhedron'', where the algorithm should search for a solution.

And if your Hessian is ill-conditioned and the solver therefore converges slowly, the ''interior-point'' algorithm cannot find a solution in this polyhedron and jumps around in the solution space. Why does it then jump around? Is it because the solution barely changes with every step the interior-point method takes in the space?

And what is the reason that a Hessian should be positive definite?

Matt J
on 20 Mar 2020 at 14:25

S.
on 23 Mar 2020 at 1:25

That's true. My questions were going a bit off-track. I will open a new thread. You can find it here:

One more question which is still relevant to quadprog: it seems that quadprog, i.e. the (conventional) interior-point algorithm, is not exploiting the sparsity and structure of the sparse QP formulation. Do you have an idea why?

In MPC, the computational complexity should scale linearly with the prediction horizon N. However, my results show that the complexity scales quadratically with N.

S.
on 23 Mar 2020 at 2:07

Yes, I did. For example, if the Hessian is H = 2*Z'*Omega*Z, then Z and Omega were constructed as sparse(Omega) and sparse(Z).

However, another point I realize now: for the dense matrices, I also passed the matrices in sparse format. Is that a big deal, that I did not pass them with full()?

Matt J
on 23 Mar 2020 at 15:18

OK, but do you know why the (conventional) interior-point algorithm of quadprog is not exploiting the sparsity and structure of the sparse QP formulation?

I can't be certain that it isn't. I have not run your code myself.

S.
on 23 Mar 2020 at 16:26

Well, if I look at my plot, the computation time scales quadratically with N instead of linearly. So it is not exploiting the sparsity fully, but only partly. If the algorithm did not exploit the sparse structure at all, the computation time would scale cubically with N, i.e. the algorithm would require O(N^3*(number of inputs + number of states)^3) operations.

If it is not exploiting it fully, is there maybe some possible explanation you can think of? It does not necessarily have to be the actual cause; a possible explanation would already help me further.

Matt J
on 23 Mar 2020 at 17:33

S.
on 23 Mar 2020 at 18:06

I'm calculating the computation time as the time required to solve one QP (and this for multiple time steps). So the computation time is not the time of solving one QP divided by the number of iterations needed to reach that solution.

I said it wrong earlier: quadprog is not exploiting the sparsity at all, as the time scales cubically with N (see the attached figure). I also added the scripts; Optimalisatieprobleem2.m is the main script.

Hopefully you can now make a better guess.

Matt J
on 23 Mar 2020 at 18:32

The "output" argument from quadprog will contain the number of iterations used, as well as other info.

[x,fval,exitflag,output] = quadprog(___)

I will not have time to look at the scripts anytime soon, but that is irrelevant for now. The computation time per iteration is the first thing you should check.
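A sketch of that check over several horizon lengths (here `buildQP` is a hypothetical helper, standing in for whatever code assembles the QP matrices for a given N):

```matlab
for N = [10 20 40 80]
    [H, f, Aineq, bineq] = buildQP(N);   % hypothetical problem assembly
    tic;
    [x, fval, exitflag, out] = quadprog(H, f, Aineq, bineq);
    t = toc;
    fprintf('N = %3d: %4d iterations, %.3g s total, %.3g s/iteration\n', ...
        N, out.iterations, t, t/out.iterations);
end
```

Comparing how the per-iteration time and the iteration count each grow with N separates the cost of the linear algebra from the cost of slow convergence.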

S.
on 23 Mar 2020 at 19:12

OK, then I now know how to do that.

So I have to check whether the computation time per iteration scales with O(N), instead of the computation time of one whole QP solve. So the number of required iterations on its own could also scale with O(N)?

Furthermore, the very last question :) : if another solver produces the same optimized values x, how can the fval be different? For example, quadprog and Gurobi give me the same x and fval, but GPAD gives me the same x with an fval that is a factor of 10 bigger than that of quadprog and Gurobi.
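One common cause, offered as a guess, is a different objective convention between solvers. Recomputing the objective yourself at each solver's returned x makes the comparison unambiguous:

```matlab
% quadprog's documented objective convention: fval = 0.5*x'*H*x + f'*x.
% Evaluate it yourself at any solver's returned solution x:
fvalCheck = 0.5*(x.'*H*x) + f.'*x;
% If two solvers return (numerically) the same x but different fval,
% they are most likely reporting different conventions, e.g. x'*H*x + f'*x
% without the factor 0.5, or an objective with a dropped constant term.
```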
