We consider the optimization problem
    minimize  f(x)
    subject to  x ∈ Ω.
• The function f: ℝn → ℝ that we wish to minimize is a real-valued function called the objective function or cost function.
• The vector x is an n-vector of independent variables: x = [x1, x2, …, xn]⊤ ∈ ℝn. The variables x1, x2, …, xn are often referred to as decision
variables.
• The set Ω is a subset of ℝn called the constraint set or feasible set.
○ Often, the constraint set Ω takes the form Ω = {x: h(x) = 0, g(x) ≤ 0}, where h and g are given functions.
• We refer to such constraints as functional constraints.
• The optimization problem above can be viewed as a decision problem that involves finding the “best” vector x of the decision variables over all
possible vectors in Ω.
○ By the “best” vector we mean the one that results in the smallest value of the objective function.
• This vector is called the minimizer of f over Ω.
• There may be many minimizers; in that case, finding any one of them will suffice.
• There are also optimization problems that require maximization of the objective function, in which case we seek maximizers.
○ Minimizers and maximizers are also called extremizers.
○ Maximization problems, however, can be represented equivalently in the minimization form above because maximizing f is equivalent to
minimizing –f.
• Therefore, we can confine our attention to minimization problems without loss of generality (see the sketch at the end of this list).
• The problem above is a general form of a constrained optimization problem, because the decision variables are constrained to be in the
constraint set Ω.
○ If Ω = ℝn, then we refer to the problem as an unconstrained optimization problem.
• In this session, we discuss basic properties of the general optimization problem above, which includes the unconstrained case.
• In the next couple of sessions, we will deal with iterative algorithms for solving unconstrained optimization problems.
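As a concrete illustration of the setup above, here is a minimal numerical sketch (the objective f, the auxiliary function g, and the starting points are made up for this example; scipy's general-purpose minimize is just one convenient solver). It also shows the max-to-min conversion: maximizing g is done by minimizing −g.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective: f(x) = (x1 - 1)^2 + (x2 + 2)^2, minimized at [1, -2]
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

# Unconstrained problem (Omega = R^n): minimize f over all of R^2
res = minimize(f, x0=np.zeros(2))
print(res.x)  # approximately [ 1., -2.]

# Maximization is handled by negating the objective:
# maximize g(x) = 1 - x1^2 - x2^2  <=>  minimize -g(x)
def g(x):
    return 1.0 - x[0] ** 2 - x[1] ** 2

res_max = minimize(lambda x: -g(x), x0=np.ones(2))
print(res_max.x)  # approximately [0., 0.], the maximizer of g
```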
Global Minimizer vs. Local Minimizer:
• Suppose that f: ℝn → ℝ is a real-valued function defined on some set Ω ⊂ ℝn.
○ A point x* ∈ Ω is a local minimizer of f over Ω if there exists ε > 0 such that f(x) ≥ f(x*) for all x ∈ Ω\{x*} and ∥x – x*∥ < ε.
○ A point x* ∈ Ω is a global minimizer of f over Ω if f(x) ≥ f(x*) for all x ∈ Ω\{x*}.
• If in the definitions above we replace “≥” with “>,” then we have a strict local minimizer and a strict global minimizer, respectively.
This is illustrated in the following figure.
• If x* is a global minimizer of f over Ω, we write f(x*) = minx∈Ω f(x) and x* = arg minx∈Ω f(x).
• If the minimization is unconstrained, we simply write f(x*)= minx f(x) or x* = arg min f(x).
○ In other words, given a real-valued function f, the notation arg min f(x) denotes the argument that minimizes the function f (a point in the
domain of f), assuming that such a point is unique (if there is more than one such point, we pick one arbitrarily).
• For example, if f: ℝ → ℝ is given by f(x) = (x + 1)² + 3, then arg min f(x) = –1.
○ If we write arg minx∈Ω, then we treat “x ∈ Ω” to be a constraint for the minimization.
• For example, for the function f above, arg minx≥0 f(x) = 0 (see the sketch after this list).
• Strictly speaking, an optimization problem is solved only when a global minimizer is found.
○ However, global minimizers are, in general, difficult to find.
• Therefore, in practice, we often have to be satisfied with finding local minimizers :( or maybe NOT!
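A quick numerical check of the two arg min statements above (a sketch; minimize_scalar and a bounded search over [0, 10] are just one way to carry it out):

```python
from scipy.optimize import minimize_scalar

f = lambda x: (x + 1) ** 2 + 3

# Unconstrained: arg min f(x) = -1
res = minimize_scalar(f)
print(res.x)  # approximately -1.0

# Constrained to x >= 0: arg min f(x) = 0 (search over a bounded interval)
res_c = minimize_scalar(f, bounds=(0, 10), method="bounded")
print(res_c.x)  # approximately 0.0
```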
Conditions for Local Minimizers:
In this section we derive conditions for a point x* to be a local minimizer. We use derivatives of a function f: ℝn → ℝ. Recall that the first-order
derivative of f, denoted Df, is the row vector
    Df(x) = [∂f/∂x1, ∂f/∂x2, …, ∂f/∂xn].
• Note that the gradient ∇f is just the transpose of Df; that is, ∇f= (Df)⊤.
• The second derivative of f: ℝn → ℝ (also called the Hessian of f) is the n × n matrix F(x) = D²f(x) whose (i, j)th entry is ∂²f/∂xi∂xj evaluated at x.
Example 1) Let f(x1, x2) = 5x1 + 8x2 + x1x2 − x1² − 2x2². Then, find its gradient and Hessian.
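One way to check the answer to Example 1 (a sketch using sympy for symbolic differentiation; the expected result is ∇f = [5 − 2x1 + x2, 8 + x1 − 4x2]⊤ with a constant Hessian):

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = 5 * x1 + 8 * x2 + x1 * x2 - x1**2 - 2 * x2**2

grad = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])  # gradient as a column vector
hess = sp.hessian(f, (x1, x2))                      # matrix of second partials

print(grad)  # [5 - 2*x1 + x2, 8 + x1 - 4*x2]^T
print(hess)  # [[-2, 1], [1, -4]]
```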
• Given an optimization problem with constraint set Ω, a minimizer may lie either
a. in the interior of Ω or
b. on the boundary of Ω.
• To study the case where it lies on the boundary, we need the notion of feasible directions.
Definition: A vector d ∈ ℝn, d ≠ 0, is a feasible direction at x ∈ Ω if there exists α0 > 0 such that x + αd ∈ Ω for all α ∈ [0, α0].
• Two-dimensional illustration of feasible directions; d1 is a feasible direction, d2 is not a feasible direction.
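To make the definition concrete, here is a small sketch with a made-up feasible set (Ω = {x : x ≥ 0} and the point x = [1, 0]⊤ are our own choices, not necessarily those of the figure); it only spot-checks a few values of α, so it is an illustration rather than a proof:

```python
import numpy as np

# Hypothetical feasible set: the nonnegative orthant Omega = {x : x >= 0}
def in_omega(x):
    return np.all(x >= 0)

x = np.array([1.0, 0.0])    # a boundary point of Omega
d1 = np.array([0.0, 1.0])   # points into Omega
d2 = np.array([0.0, -1.0])  # points out of Omega

alpha0 = 0.5
alphas = np.linspace(0.0, alpha0, 11)
print(all(in_omega(x + a * d1) for a in alphas))  # True  -> d1 is a feasible direction
print(all(in_omega(x + a * d2) for a in alphas))  # False -> d2 is not
```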
Let f: ℝn → ℝ be a real-valued function and let d be a feasible direction at x ∈ Ω. The directional derivative of f in the direction d, denoted ∂f/∂d,
is the real-valued function defined by
    ∂f/∂d (x) = lim α→0 [f(x + αd) − f(x)] / α.
• If ∥d∥ = 1, then ∂f/∂d is the rate of increase of f at x in the direction d.
○ To compute the directional derivative above, suppose that x and d are given. Then, f(x + αd) is a function of α, and
    ∂f/∂d (x) = d/dα [f(x + αd)] evaluated at α = 0.
Applying the chain rule yields
    ∂f/∂d (x) = ∇f(x)⊤d = ⟨∇f(x), d⟩ = d⊤∇f(x).
In summary, if d is a unit vector (∥d∥ = 1), then ⟨∇f(x),d⟩ is the rate of increase of f at the point x in the direction d.
Example 2) Define f: ℝ3 → ℝ by f(x) = x1x2x3, and let d = [0.5, 0.5, 0.707]⊤. Find the directional derivative of f in the direction d.
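A symbolic check for Example 2 (a sketch with sympy; note that d is approximately a unit vector, since 0.5² + 0.5² + 0.707² ≈ 1, so ⟨∇f(x), d⟩ is the rate of increase of f in the direction d):

```python
import sympy as sp

x1, x2, x3 = sp.symbols("x1 x2 x3")
f = x1 * x2 * x3

grad = sp.Matrix([sp.diff(f, v) for v in (x1, x2, x3)])  # [x2*x3, x1*x3, x1*x2]^T
d = sp.Matrix([0.5, 0.5, 0.707])

ddf = (grad.T * d)[0]  # directional derivative <grad f(x), d>
print(sp.expand(ddf))  # 0.5*x2*x3 + 0.5*x1*x3 + 0.707*x1*x2
```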
Theorem 1) First-Order Necessary Condition (FONC).
Let Ω be a subset of ℝn and f ∈ C1 a real-valued function on Ω. If x* is a local minimizer of f over Ω, then for any feasible direction d
at x*, we have
    d⊤∇f(x*) ≥ 0.
• Illustration of the FONC for a constrained case; x1 does not satisfy the FONC, whereas x2 satisfies the FONC.
• An alternative way to express the FONC is ∂f/∂d (x*) ≥ 0 for all feasible directions d.
• In other words, if x* is a local minimizer, then the rate of increase of f at x* in any feasible direction d in Ω is nonnegative.
• A special case of interest is when x* is an interior point of Ω. In this case, any direction is feasible, and we have the following result.
Corollary 1) Interior Case of FONC.
Let Ω be a subset of ℝn and f ∈ C1 a real-valued function on Ω. If x* is a local minimizer of f over Ω and if x* is an interior point of Ω, then
    ∇f(x*) = 0.
Example 3) Consider the problem
a. Is the first-order necessary condition (FONC) for a local minimizer satisfied at x = [1, 3]⊤?
b. Is the FONC for a local minimizer satisfied at x = [0, 3]⊤?
c. Is the FONC for a local minimizer satisfied at x = [1, 0]⊤?
d. Is the FONC for a local minimizer satisfied at x = [0, 0]⊤?
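Since Example 3's objective function is not reproduced above, the sketch below uses a hypothetical f(x) = (x1 − 1)² + (x2 − 3)² with Ω = {x : x1 ≥ 0, x2 ≥ 0} purely to illustrate how such FONC checks go at interior versus boundary points:

```python
import numpy as np

# Hypothetical objective f(x) = (x1 - 1)^2 + (x2 - 3)^2 on Omega = {x : x >= 0}
def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] - 3.0)])

def fonc_holds(x, feasible_dirs, tol=1e-12):
    """Check d^T grad f(x) >= 0 for each supplied feasible direction d."""
    g = grad_f(x)
    return all(d @ g >= -tol for d in feasible_dirs)

# Interior point [1, 3]: every direction is feasible, so the FONC reduces to grad f = 0.
print(np.allclose(grad_f(np.array([1.0, 3.0])), 0.0))  # True -> FONC holds

# Boundary point [0, 3]: feasible directions must have d1 >= 0; spot-check a few.
dirs = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
        np.array([0.0, -1.0]), np.array([1.0, 1.0])]
print(fonc_holds(np.array([0.0, 3.0]), dirs))  # False here: moving toward [1, 3] decreases f
```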
Example 4) Consider the set-constrained problem
where Ω = {[x1, x2]⊤ : x1² + x2² = 1}.
a. Consider a point x* ∈ Ω. Specify all feasible directions at x*.
b. Which points in Ω satisfy the FONC for this set-constrained problem?
c. Based on part b, is the FONC for this set-constrained problem useful for eliminating local-minimizer candidates?
Theorem 2) Second-Order Necessary Condition (SONC).
Let Ω ⊂ ℝn, f ∈ C2 a function on Ω, x* a local minimizer of f over Ω, and d a feasible direction at x*. If d⊤∇f(x*) = 0, then
    d⊤F(x*)d ≥ 0,
where F is the Hessian of f.
Corollary 2) Interior Case of SONC.
Let x* be an interior point of Ω ⊂ ℝn. If x* is a local minimizer of f: Ω → ℝ, f ∈ C2, then
    ∇f(x*) = 0
and F(x*) is positive semidefinite (F(x*) ≥ 0); that is, for all d ∈ ℝn,
    d⊤F(x*)d ≥ 0.
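A common numerical way to test the positive-semidefiniteness part of the interior-point SONC (a sketch; the matrices below are made up):

```python
import numpy as np

def is_positive_semidefinite(F, tol=1e-10):
    # A symmetric matrix is positive semidefinite iff all its eigenvalues are >= 0.
    return bool(np.all(np.linalg.eigvalsh(F) >= -tol))

F1 = np.array([[2.0, 0.0], [0.0, 2.0]])   # PSD (in fact positive definite)
F2 = np.array([[2.0, 0.0], [0.0, -2.0]])  # indefinite -> SONC fails at such a point

print(is_positive_semidefinite(F1))  # True
print(is_positive_semidefinite(F2))  # False
```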
Example 5) Consider a function of one variable f(x) = x3, f: ℝ → ℝ. Because f′(0) = 0, and f″(0) = 0, the point x = 0 satisfies both the FONC and SONC.
However, x = 0 is not a minimizer, since f takes negative values at points arbitrarily close to 0.
… so this means that necessary conditions are NOT sufficient!
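A quick check of Example 5 (a sketch): both derivatives vanish at 0, yet f takes smaller values immediately to the left of 0, so 0 cannot be a minimizer.

```python
import sympy as sp

x = sp.symbols("x")
f = x**3

print(sp.diff(f, x).subs(x, 0), sp.diff(f, x, 2).subs(x, 0))  # 0 0 -> FONC and SONC hold
print(f.subs(x, -0.1))  # -0.001 < f(0) = 0, so x = 0 is not a local minimizer
```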
Example 6) Consider a function f: ℝ2 → ℝ, where f(x) = x1² − x2². The FONC requires that ∇f(x) = [2x1, –2x2]⊤ = 0. Thus, x = [0, 0]⊤ satisfies the
FONC. The Hessian matrix of f is
    F = [ 2   0
          0  −2 ].
The Hessian matrix is indefinite; that is, for some d1 ∈ ℝ2 we have d1⊤Fd1 > 0 (e.g., d1 = [1, 0]⊤) and for some d2 we have d2⊤Fd2 < 0 (e.g., d2 = [0, 1]⊤). Thus, x = [0, 0]⊤ does not satisfy the SONC, and hence it is not a minimizer.
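The indefiniteness claim in Example 6 can be verified directly (a sketch; the Hessian of x1² − x2² is the constant matrix used below):

```python
import numpy as np

F = np.array([[2.0, 0.0], [0.0, -2.0]])  # Hessian of f(x) = x1^2 - x2^2

d1 = np.array([1.0, 0.0])
d2 = np.array([0.0, 1.0])

print(d1 @ F @ d1)            #  2.0 > 0
print(d2 @ F @ d2)            # -2.0 < 0 -> F is indefinite, so the SONC fails at [0, 0]
print(np.linalg.eigvalsh(F))  # [-2.  2.]: mixed signs confirm indefiniteness
```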
Theorem 3) Second-Order Sufficient Condition (SOSC), Interior Case.
Let f ∈ C2 be defined on a region in which x* is an interior point. Suppose that
• 1. ∇f(x*) = 0.
• 2. F(x*) > 0.
Then, x* is a strict local minimizer of f.
Example 7) Let f(x) = x1² + x2². Check if x = [0, 0]⊤ is a strict local minimizer.
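A symbolic check for Example 7 (a sketch with sympy; the expected conclusion is that both SOSC conditions hold at [0, 0]⊤, so it is a strict local minimizer):

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = x1**2 + x2**2

grad = sp.Matrix([sp.diff(f, x1), sp.diff(f, x2)])
hess = sp.hessian(f, (x1, x2))

print(grad.subs({x1: 0, x2: 0}))  # [0, 0]^T -> condition 1 (gradient is zero) holds
print(hess.is_positive_definite)  # True -> condition 2 holds; [0, 0]^T is a strict local minimizer
```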