Main Content

Given a set of *n* nonlinear functions
*F _{i}*(

`fsolve`

attempts to solve a system of
equations by minimizing the sum of squares of the components. If the sum of squares
is zero, the system of equations is solved. `fsolve`

has three
algorithms:

Trust-region

Trust-region-dogleg

Levenberg-Marquardt

All algorithms are large scale; see Large-Scale vs. Medium-Scale Algorithms.

The `fzero`

function solves a single
one-dimensional equation.

The `mldivide`

function solves a system of
linear equations.

Many of the methods used in Optimization Toolbox™ solvers are based on *trust regions,* a simple yet
powerful concept in optimization.

To understand the trust-region approach to optimization, consider the
unconstrained minimization problem, minimize
*f*(*x*), where the function takes vector
arguments and returns scalars. Suppose that the current point is
*x* in *n*-space and you want to improve by
moving to a point with a lower function value. To do so, the algorithm approximates
*f* with a simpler function *q*, which
reasonably reflects the behavior of function *f* in a neighborhood
*N* around the point *x*. This neighborhood is
the trust region. The solver computes a trial step *s* by
minimizing (or approximately minimizing) over *N*. The trust-region
subproblem is

$$\underset{s}{\mathrm{min}}\left\{q(s),\text{}s\in N\right\}.$$

The solver updates the current point to
*x* + *s* if *f*(*x* + *s*) < *f*(*x*); otherwise, the current point remains unchanged and the solver
shrinks *N* (the trust region) and repeats the trial step
computation.

The key questions in defining a specific trust-region approach to minimizing
*f*(*x*) are how to choose and compute the
approximation *q* (defined at the current point
*x*), how to choose and modify the trust region
*N*, and how accurately to solve the trust-region
subproblem.

In the standard trust-region method ([48]), the quadratic approximation *q* is defined by the first two
terms of the Taylor approximation to *F* at *x*.
The neighborhood *N* is usually spherical or ellipsoidal in shape.
Mathematically, the trust-region subproblem is typically stated

$$\mathrm{min}\left\{\frac{1}{2}{s}^{T}Hs+{s}^{T}g\text{suchthat}\Vert Ds\Vert \le \Delta \right\},$$ | (1) |

where *g* is the gradient of *f* at the current
point *x*, *H* is the Hessian matrix (the
symmetric matrix of second derivatives), *D* is a diagonal scaling
matrix, Δ is a positive scalar, and ∥ . ∥ is the 2-norm. To solve Equation 1, an algorithm (see [48]) can compute all eigenvalues of *H* and then apply a Newton
process to the secular equation

$$\frac{1}{\Delta}-\frac{1}{\Vert s\Vert}=0.$$

Such an algorithm provides an accurate solution to Equation 1. However, this requires time proportional to
several factorizations of *H*. Therefore, trust-region problems
require a different approach. Several approximation and heuristic strategies, based
on Equation 1, have been proposed in the literature ([42] and [50]). Optimization Toolbox solvers follow an approximation approach that restricts the
trust-region subproblem to a two-dimensional subspace *S* ([39] and [42]). After the solver computes the subspace *S*, the
work to solve Equation 1 is trivial because,
in the subspace, the problem is only two-dimensional. The dominant work now shifts
to the determination of the subspace.

The solver determines the two-dimensional subspace *S* with the
aid of a preconditioned conjugate gradient method (described in the next section).
The solver defines *S* as the linear space spanned by
*s*_{1} and
*s*_{2}, where
*s*_{1} is in the direction of the gradient
*g*, and *s*_{2} is either
an approximate Newton direction, that is, a solution to

$$H\cdot {s}_{2}=-g,$$

or a direction of negative curvature,

$${s}_{2}^{T}\cdot H\cdot {s}_{2}<0.$$

The philosophy behind this choice of *S* is to force global
convergence (via the steepest descent direction or negative curvature direction) and
achieve fast local convergence (via the Newton step, when it exists).

The process of unconstrained minimization using the trust-region approach is now easy to specify:

Formulate the two-dimensional trust-region subproblem.

Solve Equation 1 to determine the trial step

*s*.If

*f*(*x*+*s*) <*f*(*x*), then*x*=*x*+*s*.Adjust Δ.

The solver repeats these four steps until convergence, adjusting he trust-region
dimension Δ according to standard rules. In particular, the solver decreases the
trust-region size if it does not accept the trial step, when *f*(*x* + *s*) ≥
*f*(*x*). See [46] and [49] for a discussion of this aspect.

Optimization Toolbox solvers treat important cases of *f* with specialized
functions: nonlinear least-squares, quadratic functions, and linear least-squares.
However, the underlying algorithmic ideas are the same as for the general
case.

A popular way to solve large, symmetric, positive definite systems of linear equations *Hp* = –*g* is the method of Preconditioned Conjugate Gradients (PCG). This iterative
approach requires the ability to calculate matrix-vector products of the form
*H·v* where *v* is an arbitrary vector. The
symmetric positive definite matrix *M *is a*
preconditioner* for *H*. That is, *M* = *C*^{2}, where *C*^{–1}*HC*^{–1} is a well-conditioned matrix or a matrix with clustered eigenvalues.

In a minimization context, you can assume that the Hessian matrix *H* is
symmetric. However, *H* is guaranteed to be positive definite only in the
neighborhood of a strong minimizer. Algorithm PCG exits when it encounters a direction of
negative (or zero) curvature, that is, *d ^{T}Hd* ≤ 0. The PCG output direction

Another approach is to solve a linear system of equations to find the search
direction. Newton's method specifies to solve for the search direction
*d _{k}* such that

*J*(*x _{k}*)

where *J*(*x _{k}*) is the

$$J\left({x}_{k}\right)=\left[\begin{array}{c}\nabla {F}_{1}{\left({x}_{k}\right)}^{T}\\ \nabla {F}_{2}{\left({x}_{k}\right)}^{T}\\ \vdots \\ \nabla {F}_{n}{\left({x}_{k}\right)}^{T}\end{array}\right].$$

Newton's method can be problematic.
*J*(*x _{k}*) might be
singular, in which case the Newton step

Using trust-region techniques (introduced in Trust-Region Methods for Nonlinear Minimization) handles the case when
*J*(*x _{k}*) is singular
and improves robustness when the starting point is far from the solution. To use a
trust-region strategy, you need a merit function to decide if

$$\underset{d}{\mathrm{min}}f(d)=\frac{1}{2}F{\left({x}_{k}+d\right)}^{T}F\left({x}_{k}+d\right).$$

But a minimum of * f*(*d*) is not necessarily a
root of *F*(*x*).

The Newton step *d _{k}* is a root of

*M*(*x _{k}* +

so it is also a minimum of *m*(*d*),
where

$$\begin{array}{c}\underset{d}{\mathrm{min}}m(d)=\frac{1}{2}{\Vert M\left({x}_{k}+d\right)\Vert}_{2}^{2}=\frac{1}{2}{\Vert F\left({x}_{k}\right)+J\left({x}_{k}\right)d\Vert}_{2}^{2}\\ =\frac{1}{2}F{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+{d}^{T}J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+\frac{1}{2}{d}^{T}J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)d.\end{array}$$ | (2) |

*m*(*d*) is a better choice of merit function
than *f*(*d*), so the trust-region subproblem
is

$$\underset{d}{\mathrm{min}}\left[\frac{1}{2}F{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+{d}^{T}J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right)+\frac{1}{2}{d}^{T}J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)d\right],$$ | (3) |

such that ∥*D*·*d*∥ ≤ Δ. You can solve this subproblem efficiently using a dogleg
strategy.

For an overview of trust-region methods, see Conn [4] and Nocedal [31].

The key feature of the trust-region-dogleg algorithm is the use of the Powell
dogleg procedure for computing the step *d*, which minimizes
Equation 3. For a detailed description, see Powell [34].

The algorithm constructs the step *d* from a convex
combination of a Cauchy step (a step along the steepest descent direction) and a
Gauss-Newton step for *f*(*x*). The Cauchy
step is calculated as

*d _{C}* =
–

where *α* minimizes Equation 2.

The Gauss-Newton step is calculated by solving

*J*(*x _{k}*)·

using the MATLAB^{®}
`mldivide`

(matrix left division)
operator.

The algorithm chooses the step *d* so that

*d* = *d _{C}*
+

where *λ* is the largest value in the interval [0,1] such
that ∥*d*∥ ≤ Δ. If *J _{k}* is (nearly)
singular,

The trust-region-dogleg algorithm is efficient because it requires only one linear solve per iteration (for the computation of the Gauss-Newton step). Additionally, the algorithm can be more robust than using the Gauss-Newton method with a line search.

The Levenberg-Marquardt algorithm ([25], and [27]) uses a search direction that is a solution of the linear set of equations

$$\left(J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)+{\lambda}_{k}I\right){d}_{k}=-J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right),$$ | (4) |

or, optionally, of the equations

$$\left(J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)+{\lambda}_{k}\text{diag}\left(J{\left({x}_{k}\right)}^{T}J\left({x}_{k}\right)\right)\right){d}_{k}=-J{\left({x}_{k}\right)}^{T}F\left({x}_{k}\right),$$ | (5) |

where the scalar *λ _{k}* controls both the
magnitude and direction of

`fsolve`

option `ScaleProblem`

to
`'none'`

to use Equation 4, or set this option to
`'jacobian'`

to use Equation 5.When *λ _{k}* is zero, the direction

`fzero`

attempts to find the root of a scalar function
*f* of a scalar variable *x*.

`fzero`

looks for an interval around an initial point such that
*f*(*x*) changes sign. If you specify an
initial interval instead of an initial point, `fzero`

checks to
make sure that *f*(*x*) has different signs at the
endpoints of the interval. The initial interval must be finite; it cannot contain
±`Inf`

.

`fzero`

uses a combination of interval bisection, linear
interpolation, and inverse quadratic interpolation in order to locate a root of
*f*(*x*). See `fzero`

for more information.

The `\`

algorithm is described in the MATLAB arithmetic operators section for `mldivide`

.