Matlab Codes for Implicitly Constrained Optimization Problems
These are the Matlab codes used in the 2008 version of the paper
M. Heinkenschloss:
Numerical Solution of Implicitly Constrained Optimization Problems.
Let f: R^{n_y} × R^{n_u} → R and
c: R^{n_y} × R^{n_u} → R^{n_y}
be given smooth functions.
Assume that for every u the equation
(1)    c(y,u) = 0
has a unique solution y(u).
We define the function φ: R^{n_u} → R,
(2)    φ(u) = f(y(u),u),
and consider the implicitly constrained optimization problem
(3)    min_u φ(u).
(Note: In this html document I use φ to denote the reduced function instead of 'hatf'.)
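For later reference, the gradient of the reduced function φ can be obtained with the standard
adjoint approach (this is what the gradient routine in the Burgers example below implements).
Differentiating (1) with respect to u gives c_y(y(u),u) y'(u) + c_u(y(u),u) = 0, and hence
∇φ(u) = ∇_u f(y(u),u) + c_u(y(u),u)^T λ(u),
where the adjoint λ(u) solves c_y(y(u),u)^T λ(u) = -∇_y f(y(u),u).
See the paper for details.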
The paper M. Heinkenschloss:
Numerical Solution of Implicitly Constrained Optimization Problems
discusses the application of optimization algorithms for the solution of (3).
This page contains links to the Matlab code used in that paper.
Download a zip file with all Matlab functions or download
individual functions below. The functions are organized into three different
subdirectories shown below.
- optimization
Code for the solution of min_x f(x), where f: R^n → R.
Note that the optimization algorithms use the generic notation typically found in unconstrained
optimization, rather than the notation in (3): in the optimization algorithms the unknowns are
denoted by x and the objective function is denoted by f.
- newton_cg.m:
This is an implementation of the Newton-CG method described, e.g., in Section 7.1 of the book
J. Nocedal and S. J. Wright: Numerical Optimization, second edition,
Springer Verlag, Berlin, Heidelberg, New York, 2006.
- lbfgs.m:
This is an implementation of the limited-memory BFGS method described, e.g., in Section 7.2 of the book
J. Nocedal and S. J. Wright: Numerical Optimization, second edition,
Springer Verlag, Berlin, Heidelberg, New York, 2006.
However, this implementation uses an Armijo line search or a backtracking line search.
Unlike a line search that enforces the so-called Wolfe conditions (see the book by
Nocedal and Wright), neither of these guarantees that y_k^T s_k > 0
(here we use the notation of Nocedal and Wright). We skip the update if
y_k^T s_k > 0 is not satisfied; a sketch of this safeguard is given after this file list. It is easy to replace
the existing line search algorithms with a line search algorithm that satisfies the Wolfe
conditions (for details see pages 60-62 in J. Nocedal and S. J. Wright:
Numerical Optimization, second edition, Springer Verlag, Berlin, Heidelberg, New York, 2006).
- mycg.m: Implementation of the conjugate gradient
method for the computation of approximate Newton steps.
Called in newton_cg.m; a sketch of the basic idea is given after this file list.
- lnsrch_arm.m: Implementation of the
Armijo line-search.
- lnsrch_bt.m: Implementation of the
backtracking line-search. This implementation follows that in
J. E. Dennis, Jr., and R. B. Schnabel: Numerical Methods for
Nonlinear Equations and Unconstrained Optimization, SIAM, Philadelphia, 1996.
(A generic sketch of the Armijo backtracking idea underlying both line searches is given after this file list.)
- deriv_check.m:
Function that, given the user interface described below, performs some basic derivative checks.
A finite-difference sketch of such a check is given below.
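The following fragment is only a sketch of the curvature safeguard mentioned in the description of
lbfgs.m above; the function name and the way the pairs are stored are illustrative and not taken from
the distributed code.

  function [S, Y] = store_pair_if_curved(S, Y, sk, yk)
  % Sketch (not from lbfgs.m): append the L-BFGS pair (s_k, y_k) only if the
  % curvature condition y_k^T s_k > 0 holds; otherwise the update is skipped,
  % so that the implicitly defined BFGS matrix stays positive definite.
  % (With a non-Euclidean inner product, yk.'*sk would be replaced by
  % xprod(yk, sk, usr_par).)
  if yk.'*sk > 0
      S = [S, sk];    % stored steps s_k = x_{k+1} - x_k
      Y = [Y, yk];    % stored gradient differences y_k = g_{k+1} - g_k
  end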
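The following is only a sketch of the truncated conjugate gradient iteration used to compute
approximate Newton steps; mycg.m itself may differ in its interface and stopping tests. It uses the
user-provided function Hessvec described below.

  function [d] = newton_step_cg_sketch(g, x, tol, maxit, usr_par)
  % Sketch (not mycg.m): approximately solve the Newton system H*d = -g,
  % where H*v is available through Hessvec(v, x, usr_par), by the conjugate
  % gradient method. The iteration is truncated when a direction of
  % nonpositive curvature is encountered or when the residual is small enough.
  d  = zeros(size(g));
  r  = -g;                 % residual of H*d = -g for d = 0
  p  = r;
  rr = r.'*r;
  for k = 1:maxit
      Hp   = Hessvec(p, x, usr_par);
      curv = p.'*Hp;
      if curv <= 0                   % nonpositive curvature encountered
          if k == 1, d = -g; end     % fall back to the steepest descent direction
          return
      end
      alpha = rr/curv;
      d     = d + alpha*p;
      r     = r - alpha*Hp;
      rrnew = r.'*r;
      if sqrt(rrnew) <= tol*norm(g)  % residual small relative to the gradient
          return
      end
      p  = r + (rrnew/rr)*p;
      rr = rrnew;
  end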
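The distributed line searches follow the references given above; the sketch below only illustrates the
basic backtracking Armijo idea. All names are illustrative, and the actual step-size reduction may be
implemented differently (e.g., by interpolation).

  function [t, xt, ft] = armijo_backtrack_sketch(x, fx, gx, dir, usr_par)
  % Sketch (not lnsrch_arm.m / lnsrch_bt.m): reduce the step size t until the
  % sufficient decrease (Armijo) condition
  %     f(x + t*dir) <= f(x) + c1 * t * gx'*dir
  % holds. Uses the user-provided function fval described below.
  % (A full implementation would also call xnew before each evaluation of
  % fval at a new point.)
  c1  = 1.e-4;          % sufficient decrease parameter
  t   = 1;              % initial trial step size
  gxd = gx.'*dir;       % directional derivative; should be negative
  for k = 1:30          % cap the number of step reductions
      xt = x + t*dir;
      ft = fval(xt, usr_par);
      if ft <= fx + c1*t*gxd
          return        % Armijo condition satisfied
      end
      t = t/2;          % backtrack
  end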
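The sketch below shows the kind of finite-difference comparison such a derivative check performs,
using the user interface described below; deriv_check.m itself may organize the tests differently.

  function deriv_check_sketch(x, dx, usr_par)
  % Sketch (not deriv_check.m): compare the directional derivative gx'*dx and
  % the Hessian-times-vector product Hessvec(dx, x, usr_par) with finite
  % differences of fval and grad along the direction dx.
  % (Problems that rely on xnew would call xnew before the evaluations at the
  % perturbed points.)
  fx  = fval(x, usr_par);
  gx  = grad(x, usr_par);
  Hdx = Hessvec(dx, x, usr_par);
  for h = 10.^(-(1:8))
      fd_f = (fval(x + h*dx, usr_par) - fx)/h;   % finite difference of f
      fd_g = (grad(x + h*dx, usr_par) - gx)/h;   % finite difference of grad
      fprintf('h = %8.1e   grad error = %8.1e   Hessvec error = %8.1e\n', ...
              h, abs(fd_f - gx.'*dx), norm(fd_g - Hdx));
  end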
To use the functions above, the user has to provide the following Matlab functions.
All of these functions have an input parameter usr_par.
The variable usr_par is never accessed in the optimization codes,
but can be used to pass parameters and other problem-specific information to the user-provided functions.
- function [fx] = fval(x, usr_par): Evaluate f(x).
- function [gx] = grad(x, usr_par): Evaluate ∇ f(x).
- function [Hv] = Hessvec(v, x, usr_par): Evaluate ∇²f(x) v.
(Only used in newton_cg.m.)
- function [Hv] = H0vec(v, usr_par): Compute H_0 v,
where H_0 is the initial BFGS matrix, which
serves as a replacement for the inverse of the Hessian of f.
(Only used in lbfgs.m.)
- function [usr_par] = xnew(x, iter, usr_par): xnew is called
whenever a new x is generated, and it is called before any of the three functions
fval, grad, and Hessvec is called with this new x.
- function [x1x2] = xprod(x1, x2, usr_par): All optimization
algorithms above are implemented using an arbitrary inner product
<x1, x2> between two vectors x1, x2.
A properly chosen inner product can substantially improve the performance of the
optimization algorithm. This is especially important for problems involving discretizations of
differential equations. See Section 2 of
M. Heinkenschloss and L. N. Vicente: An Interface Between Optimization and Application
for the Numerical Solution of Optimal Control Problems,
ACM Transactions on Mathematical Software, Vol. 25 (1999), pp. 157-190, for a relatively
elementary exposition and more references.
In both examples below we use the standard Euclidean inner product
<x1, x2> = x1^T x2, and
with this choice newton_cg.m,
lbfgs.m, and mycg.m
correspond to the Newton-CG, LBFGS, and conjugate gradient methods discussed, e.g.,
in J. Nocedal and S. J. Wright: Numerical Optimization, second edition,
Springer Verlag, Berlin, Heidelberg, New York, 2006.
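The following is only an illustration of a non-Euclidean choice of xprod; the matrix M (for example a
mass matrix from a finite element discretization) and the way it is passed through usr_par are
assumptions made for this sketch and are not part of the distributed codes, which use the Euclidean
inner product.

  function [x1x2] = xprod(x1, x2, usr_par)
  % Sketch of a weighted inner product  <x1, x2> = x1^T M x2.
  % M is assumed to be a symmetric positive definite matrix stored in the
  % (hypothetical) field usr_par.M.
  x1x2 = x1' * (usr_par.M * x2);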
- rosenbrock
The codes in this section apply newton_cg.m
and lbfgs.m to minimize the 2D Rosenbrock function,
i.e., to solve
min (1-x_1)^2 + 100 (x_2 - x_1^2)^2.
This is not an implicitly constrained problem, but is included to illustrate how to apply
the optimization algorithms to a simple model problem.
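For illustration, the essential content of the user-provided functions fval.m, grad.m, and Hessvec.m
for this problem is sketched below (each function lives in its own file; the distributed files may differ
in details):

  function [fx] = fval(x, usr_par)
  % Rosenbrock function value.
  fx = (1 - x(1))^2 + 100*(x(2) - x(1)^2)^2;

  function [gx] = grad(x, usr_par)
  % Gradient of the Rosenbrock function.
  gx = [ -2*(1 - x(1)) - 400*x(1)*(x(2) - x(1)^2);
          200*(x(2) - x(1)^2) ];

  function [Hv] = Hessvec(v, x, usr_par)
  % Hessian of the Rosenbrock function at x, times the vector v.
  H  = [ 2 - 400*(x(2) - x(1)^2) + 800*x(1)^2,  -400*x(1);
        -400*x(1),                               200 ];
  Hv = H*v;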
- burgers
The codes in this section apply newton_cg.m
and lbfgs.m to solve the discretized optimal control
problem governed by Burgers equation as described in
the paper M. Heinkenschloss:
Numerical Solution of Implicitly Constrained Optimization Problems.
- optimization_driver.m:
The main script.
- prob_gen.m: Computes quantities like the
stiffness matrix Ah. These are stored as global variables.
- state.m: Compute a solution of
the discretized Burgers equation.
- adjoint.m: Compute a solution of
the discretized adjoint equation.
- Ny.m, Nyp.m:
Evaluate the nonlinearity in the discretized Burgers equation and the Jacobian of the
nonlinearity in the discretized Burgers equation, respectively.
- xnew.m: Solve the Burgers equation and
solve the adjoint equation. The solutions of the Burgers equation and of the adjoint equation
are stored as global variables so that they can be accessed by
fval.m,
grad.m, and
Hessvec.m.
Furthermore, the current control, the state, and the adjoint are plotted.
(Note that we could have computed the solution of the adjoint equation later, in
grad.m.)
- fval.m,
grad.m,
Hessvec.m:
Implementation of the function evaluation, the gradient evaluation using the adjoint approach,
and Hessian-times-vector computation.
- H0vec.m: Compute the product H_0 v of the initial BFGS matrix H_0 with a vector v.
- xprod.m:
For two vectors x1, x2
this function returns the Euclidean inner product x1^T x2.
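Given the description above, the content of this file is essentially the following (sketch):

  function [x1x2] = xprod(x1, x2, usr_par)
  % Euclidean inner product between the vectors x1 and x2.
  x1x2 = x1' * x2;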