A Newton-type Method for Constrained Optimization
Titus Pința
Newton differentiability subsumes all aspects of Newton's method
- First Order Methods
- Stochastic Optimization
- Quasi-Newton Methods
- Equations on Metric Spaces
- Constrained Optimization
- \(\gdef\bbRn{\mathbb{R}^n}\mathcal{N}_{\mathcal{H}F}:\bbRn \rightrightarrows \bbRn\),
\[
\mathcal{N}_{\mathcal{H}F} x = \{x + d~|~H \in \mathcal{H}F(x), Hd = -F(x)\},
\]
is the Newton operator
- \(\gdef\bbN{\mathbb{N}}{\{x^k\}}_{k \in \bbN}\), \(x^{k+1} \in \mathcal{N}_{\mathcal{H}F}x^k\) is the Newton-type method
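To make the operator concrete, here is a minimal sketch (Python/NumPy, not from the talk; the toy $F$ and all function names are illustrative assumptions): at each iterate, pick some $H \in \mathcal{H}F(x)$, solve $Hd = -F(x)$, and update $x \leftarrow x + d$.

```python
import numpy as np

def newton_type(F, HF, x0, tol=1e-12, max_iter=50):
    """One selection from the Newton operator N_{HF}:
    pick H in HF(x), solve H d = -F(x), set x <- x + d."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        H = HF(x)                        # any selection H in HF(x)
        d = np.linalg.solve(H, -F(x))
        x = x + d
        if np.linalg.norm(F(x)) < tol:
            break
    return x

# Toy example: F(x) = (x1^2 - 1, x2^2 - 4), HF = its Jacobian.
F  = lambda x: np.array([x[0]**2 - 1.0, x[1]**2 - 4.0])
HF = lambda x: np.diag([2.0 * x[0], 2.0 * x[1]])
print(newton_type(F, HF, [3.0, 3.0]))  # approaches (1, 2)
```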
- $\gdef\xbar{\bar{x}}F(\xbar) = 0$
- $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|H^{-1}\| \le \Omega$

Then any Newton-type method converges superlinearly for any $x^0$ close enough to $\xbar$.
$F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$ in each of the following cases:
- $F$ is $\mathcal{C}^{1}$, with $\mathcal{H}F = \nabla F$
- $F$ is semismooth, with $\mathcal{H}F = F'$
- $F$ is tame (definable in an $o$-minimal structure), with $\mathcal{H}F = F'$
Examples:
-
\[
F(x, y) =
\begin{bmatrix}
4x(x^2 - y) + 2(x - 1) \\ -2(x^2 - y)
\end{bmatrix}
\quad
\mathcal{H}F(x, y) = \left\{2
\begin{bmatrix}
1 + 4 x^2 & -2 x \\ -2 x & 1
\end{bmatrix}\right\}
\]
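One can check numerically (Python/NumPy sketch of my own, assuming the first component reads $4x(x^2 - y) + 2(x - 1)$, i.e. $F$ is the gradient of ${(y - x^2)}^2 + {(1 - x)}^2$) that the Newton-type method with this single-valued $\mathcal{H}F$, which matches the true Hessian at $(1, 1)$, converges to the zero $(1, 1)$:

```python
import numpy as np

# F = grad of (y - x^2)^2 + (1 - x)^2; HF is the Gauss-Newton-type
# selection from the slide; it agrees with the Hessian at (1, 1).
def F(z):
    x, y = z
    return np.array([4*x*(x**2 - y) + 2*(x - 1), -2*(x**2 - y)])

def HF(z):
    x, _ = z
    return 2.0 * np.array([[1 + 4*x**2, -2*x], [-2*x, 1.0]])

z = np.array([1.1, 1.3])
for _ in range(20):
    z = z + np.linalg.solve(HF(z), -F(z))
print(z)  # ≈ (1, 1)
```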
-
\[\gdef\Id{\operatorname{Id}}
F(x) = \left \{
\begin{array}{rl}
x & \quad x \in \mathbb{Q}, \\
-x & \quad x \in \bbR \setminus \mathbb{Q}
\end{array}
\right. \quad \mathcal{H}F(x) = \left \{
\begin{array}{rl}
1 & \quad x \in \mathbb{Q}, \\
-1 & \quad x \in \bbR \setminus \mathbb{Q}
\end{array}
\right.
\]
- $F$ is Lipschitz, with $\mathcal{H}F = \alpha\Id$ (then $F$ is weakly Newton differentiable)
$F:\bbRn \to \bbRn$ is
weakly Newton differentiable at $\xbar$:
$\exists\gdef\setto{\rightrightarrows}\gdef\bbRnxn{\mathbb{R}^{n \times n}}\mathcal{H}F:\bbRn \setto \bbRnxn$
\[ \limsup_{x \to \xbar}\sup_{H \in \mathcal{H}F(x)}\frac{\|F(x) - F(\xbar) - H(x - \xbar)\|}{\|x - \xbar\|} < \infty \]
- $F(\xbar) = 0$
- $F:\bbRn \to \bbRn$ is weakly Newton differentiable at $\xbar$ with $\mathcal{H}F$
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|H^{-1}\| \le \Omega$
- $\Omega\limsup_{x \to \xbar}\sup_{H \in \mathcal{H}F(x)}\frac{\|F(x) - F(\xbar) - H(x - \xbar)\|}{\|x - \xbar\|} < 1$

Then any Newton-type method converges linearly for any $x^0$ close enough to $\xbar$.
- $F$ is Lipschitz, with $\mathcal{H}F = \alpha\Id$

For $F = \nabla f$, $\mathcal{N}_{\alpha \Id}x = x - \alpha^{-1} \nabla f(x)$.
If either
- $\nabla f$ is Lipschitz, hypo-monotone and metrically subregular
- $f$ has the KŁ property and ${\{x^k\}}_{k \in \bbN}$ has sufficient decrease and a gradient lower bound

then $\nabla f$ is weakly Newton differentiable and Newton's method converges linearly.
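A quick numerical illustration (Python/NumPy sketch on a toy quadratic of my choosing): with $\mathcal{H}F = \alpha\Id$ the Newton-type method is gradient descent, and on $f(x) = \tfrac12 x^\top A x$ the error contracts by a fixed factor per step, i.e. linearly.

```python
import numpy as np

# Gradient descent = Newton-type method with HF = alpha * Id:
# x+ = x - alpha^{-1} grad f(x).  Toy quadratic f(x) = 0.5 x' A x.
A = np.diag([1.0, 4.0])
alpha = 4.0                      # alpha >= lambda_max(A)
grad = lambda x: A @ x

x = np.array([1.0, 1.0])
errs = []
for _ in range(60):
    x = x - grad(x) / alpha
    errs.append(np.linalg.norm(x))
print(errs[-1] / errs[-2])  # ratio ≈ 0.75: linear, not superlinear
```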
Recall:
- $F(\xbar) = 0$
- $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|H^{-1}\| \le \Omega$

Then any Newton-type method converges superlinearly for any $x^0$ close enough to $\xbar$.
If $F$ is Fréchet differentiable on a set, with $\nabla F$ Lipschitz continuous,
\[\|\mathcal{H}F(x) - \mathcal{H}F(y)\| \le L\|x - y\|\vphantom{\|^{\gamma-1}} \]
and, with $B := \|{\mathcal{H}F(x^0)}^{-1}\|$ and $\eta := \|{\mathcal{H}F(x^0)}^{-1}F(x^0)\|$,
\[LB\eta \le \frac{1}{2}
\]
then $\exists \xbar$ such that $F(\xbar) = 0$.
If $F$ is Newton differentiable on a set, with $\gdef\tbar{\bar{t}}\mathcal{H}F$ Hölder continuous,
\[\|\mathcal{H}F(x) - \mathcal{H}F(y)\| \le L\|x - y\|^{\gamma-1}\]
and, with $B := \|{\mathcal{H}F(x^0)}^{-1}\|$ and $\eta := \|{\mathcal{H}F(x^0)}^{-1}F(x^0)\|$, there is a majorant $f$ with
\[
\begin{aligned}
&\frac{LB}{2}{\left(\frac{f(t)}{f'(t)}\right)}^\gamma \le
f\left(t - \frac{f(t)}{f'(t)}\right),
&
f(0) = \eta,\quad
f(t) > 0,\quad f(\tbar) = 0, \\
&f'(t) < 0,\quad |f'(t)| \ge {(1 + L t)}^{-1},
& f''(t) > 0
\end{aligned} \]
then $\exists \xbar$ such that $F(\xbar) = 0$.
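For the classical (Lipschitz) case the condition is easy to check on a toy scalar equation (Python sketch, example of my choosing): for $F(x) = x^2 - 2$ and $x^0 = 1$ we get $L = 2$, $B = \eta = \tfrac12$, so $LB\eta = \tfrac12$ and a zero ($\sqrt 2$) is guaranteed to exist.

```python
# Kantorovich check for F(x) = x^2 - 2 at x0 = 1 (scalar case).
F = lambda x: x * x - 2.0
dF = lambda x: 2.0 * x          # F' is Lipschitz with constant L = 2

x0 = 1.0
L = 2.0
B = abs(1.0 / dF(x0))           # = 1/2
eta = abs(F(x0) / dF(x0))       # = 1/2, length of the first Newton step
assert L * B * eta <= 0.5       # Kantorovich condition holds

x = x0
for _ in range(10):
    x = x - F(x) / dF(x)        # Newton iteration
print(x)  # ≈ sqrt(2)
```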
Modified Newton Methods

If $F$ is Newton differentiable with $\mathcal{H}F$, then $F$:
- is Newton differentiable with $\mathcal{G}F(x) = \mathcal{H}F(x) + B_{\varepsilon(x)}[0]$, provided $\varepsilon(x) \to 0$ as $x \to \xbar$
- is Newton differentiable with $\mathcal{G}F(x)=\mathcal{H} F(\phi(x))$ for $\lim_{x \to \xbar}\phi(x) = \xbar$
- is Newton differentiable with any $\mathcal{G}F$ with
\[\gdef\Hbar{\bar{H}}
\lim_{x \to \xbar} \sup_{H \in \mathcal{H}F(x), G \in \mathcal{G}F(x)}
\frac{\|H(x - \xbar) - G(x - \xbar)\|}{\|x - \xbar\|} = 0
\]
- is Newton differentiable for $H$ produced by a quasi-Newton update rule
$\mathcal{U}:\bbRn \times \bbRnxn \to \bbRnxn$ with
\[
\lim_{x, H \to \xbar, \Hbar} \frac{\|\mathcal{U}(x, H) - \Hbar\|_F}{\|H - \Hbar\|_F} = 0
\]
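A scalar sketch (Python, toy example of my choosing) of the first perturbation above: replacing the Newton derivative by $F'(x) + \varepsilon(x)$ with $\varepsilon(x) \to 0$ at the root keeps the method convergent.

```python
# Modified Newton: use G(x) = F'(x) + eps(x) with eps(x) -> 0 at the root.
# Toy problem F(x) = x^3 - 1, root at x = 1.
F = lambda x: x**3 - 1.0
dF = lambda x: 3.0 * x**2
eps = lambda x: abs(F(x))       # perturbation, vanishes as x -> 1

x = 2.0
for _ in range(50):
    x = x - F(x) / (dF(x) + eps(x))
print(x)  # ≈ 1
```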
Stochastic Newton

In the setting of optimization, $H$ is sampled randomly and we employ a backtracking strategy for sufficient decrease, i.e.
\[f(x^k) - f(y) \ge c_k\|y - x^k\|\]
Assume that, for a given $x$, $\mathbb{P}(H \in \mathcal{H}F(x)) > 1 - \delta$, and let $K$ be the number of iterations until $\gdef\bbE{\mathbb{E}}\|\nabla f(x^K)\| \le \varepsilon$.
Then
\[\bbE(K) \le \frac{1}{1 - \delta}\left(
\frac{f(x^0) - f(\xbar)}{\rho_{\min}{(d\varepsilon)}^2}
+ \frac{\log \frac{\rho_{\min}}{\rho^0}}{\log \alpha }
\right) \]
In particular, for additive Gaussian noise,
\[\bbE(K) \le \frac{1}{1 - n^{-1}}\left(\frac{f(x^0) - f(\xbar)}{0.5(1-c){(\omega^{-1}\varepsilon)}^2}
+ \frac{\log \frac{0.5(1-c)}{\rho^0}}{\log \alpha } \right) \]
Quasi-Metric Spaces
$\gdef\bbM{\mathbb{M}}H:\bbM \times \bbM \to \bbRn$ is called pseudo-linear if $H(x, x) = 0$.
$H^{-}:\bbM \times \bbRn \to \bbM$ is a quasi-inverse if $H^{-}(x, 0) = x$ and
\[\gdef\dist{\operatorname{dist}}
\begin{aligned}
\exists \|H^{-}\| &\in [0, \infty): \\ &\dist(H^{-}(x, v), H^{-}(y, w))
\le \|H^{-}\|\|v - w - H(x, y)\|
\end{aligned}
\]
$F:\bbM \to \bbRn$ is Newton differentiable at $\xbar$ if there is
$\mathcal{H}F$ with
\[
\lim_{x \to \xbar}\sup_{H \in \mathcal{H}F(x)}\frac{\|F(x) - F(\xbar) - H(x, \xbar)\|}{\dist(x, \xbar)} = 0
\]
- Convergence Theorem
- Kantorovich Theorem
Warm Up
\[
\gdef\minim{\operatorname{minimize}}
\gdef\st{\operatorname{subject\ to}}
\gdef\minimize#1#2#3{{\begin{array}{ll}%
\displaystyle{\underset{#1}{\minim}}\quad& #2 \\%
\st & #3%
\end{array}}}
\gdef\minprob#1#2{\begin{array}{ll}%
\displaystyle{\minim_{#1}}\quad& #2 \\%
\end{array}}
\minimize{x \in \bbRn}{f(x)}{Ax = b}\]
Consider $\tilde{f}:\ker A \cong \bbR^{n - m} \to \bbR$, $\tilde{f} = f|_{\{Ax = b\}}$.
It remains to relate $\nabla \tilde{f}$ and $\nabla^2 \tilde{f}$ to $f$:
\[\gdef\proj{\operatorname{P}}
\nabla \tilde{f} = \proj_{\ker A} \nabla f
\qquad
\nabla^2 \tilde{f} = \proj_{\ker A} \circ \nabla^2 f \circ \proj_{\ker A}
\]
\[
\mathcal{N}_{f}x := x - {(\proj_{\ker A} \circ \nabla^2 f(x) \circ \proj_{\ker A})}^{-1} \proj_{\ker A} \nabla f(x)
\]
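A minimal sketch of this operator (Python/NumPy, toy data of my choosing): minimize $\tfrac12\|x - c\|^2$ subject to $Ax = b$, so $\nabla f(x) = x - c$ and $\nabla^2 f = \operatorname{Id}$; starting from a feasible point, one projected Newton step lands on the constrained minimizer, the projection of $c$ onto $\{Ax = b\}$.

```python
import numpy as np

# Projected Newton step: solve (P H P) d = -P grad f(x) on ker A,
# where P is the orthogonal projection onto ker A.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 0.0])

P = np.eye(2) - np.linalg.pinv(A) @ A    # projection onto ker A
x = np.array([1.0, 0.0])                 # feasible: A x = b

g = x - c                                # grad f(x)
H = np.eye(2)                            # Hessian of f
d = np.linalg.lstsq(P @ H @ P, -P @ g, rcond=None)[0]
x = x + d
print(x)  # projection of c onto {Ax = b}: (1.5, -0.5)
```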
Problem setup:
Find $\xbar$ such that $F(\xbar) \perp \ker A$
Let $\mathcal{A} := \{x ~|~ Ax = b\}$
-
$\mathcal{N}_{\mathcal{H}F}:\bbRn \setto \bbRn$,
\[
\mathcal{N}_{\mathcal{H}F}x = \left \{
\begin{aligned}
\proj_{\mathcal{A}}x + d~|~H \in \mathcal{H}F(\proj_{\mathcal{A}}x),&
\proj_{\ker A}H\proj_{\ker A}d = \\
&-\proj_{\ker A}F(\proj_{\mathcal{A}}x)
\end{aligned}\right\}\]
is the Newton operator
- ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}F}x^k$ is the Newton-type method
- $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
- $\proj_{\ker A} F(\xbar) = 0$
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|{(\proj_{\ker A}H\proj_{\ker A})}^{-1}\| \le \Omega$
Then any Newton-type method converges superlinearly for any $x^0$ close
enough to $\xbar$
-
$\mathcal{N}_{\mathcal{H}G}:\bbRn \setto \bbRn$,
\[
\mathcal{N}_{\mathcal{H}G}x = \left \{
\begin{aligned}
\proj_{\mathcal{A}(x, H)}x~|~&H \in \mathcal{H}G(x), \mathcal{A}(x, H) \\
&= \{y~|~G(x) + H(y - x) = 0\}
\end{aligned}\right\}\]
is the Newton operator
- ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}G}x^k$ is the Newton-type method
- $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
- $\gdef\bbRm{\mathbb{R}^m}G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
- $\mathcal{H}G$ is linearly geometrically compatible

Then the Newton-type method approaches $\mathcal{Z}$ superlinearly for any $x^0$ close enough to $\mathcal{Z}$.
\[
\minimize{x \in \bbRn}{f(x)}{G(x) = 0}
\]
First order optimality:
\[
\proj_{\ker \nabla G(\xbar)}\nabla f(\xbar) = 0
\]
Problem:
Given $F$, $\mathcal{H}F$ and $G$, $\mathcal{H}G$, find $\xbar$ with
\[
\sup_{H_G \in \mathcal{H}G(\xbar)}\|\proj_{\ker H_G}F(\xbar)\| = 0
\]
The algorithm for linear equality constraints
\[
\mathcal{N}_{\mathcal{H}F}x = \{\proj_{\mathcal{A}}x + d~|~H_F \in \mathcal{H}F(\proj_{\mathcal{A}}x),
\proj_{\ker A}H_F\proj_{\ker A}d = -\proj_{\ker A}F(\proj_{\mathcal{A}}x)\}
\]
-
$\mathcal{N}_{\mathcal{H}F, \mathcal{H}G}:\bbRn \setto \bbRn$,
\[\begin{aligned}
\mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x = \{\proj_{\mathcal{A}(x, H_G)}x + d~|~H_G \in \mathcal{H}G(x),H_F \in \mathcal{H}F(\proj_{\mathcal{A}(x, H_G)}x),
\\
\proj_{\ker H_G}H_F\proj_{\ker H_G}d = -\proj_{\ker H_G}F(\proj_{\mathcal{A}(x, H_G)}x)\}
\end{aligned}\]
is the Newton operator
- ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x^k$ is the Newton-type method
using LinearAlgebra
using Krylov

# One Newton step for: find x with G(x) = 0 and P_{ker H_G} F(x) = 0,
# where J returns a selection H_G in HG and H a selection H_F in HF.
function step(x, F, H, G, J)
    Jₓ = J(x)
    J⁺ = pinv(Jₓ)
    xᴾ = x - J⁺ * G(x)          # Newton step onto the linearized constraint
    P = I - J⁺ * Jₓ             # projection onto ker Jₓ
    b = P * F(xᴾ)
    A = P * H(xᴾ) * P           # tangential Newton system
    d, _ = Krylov.minres(A, -b) # solve on ker Jₓ; minres returns (x, stats)
    return xᴾ + d
end
- $G(\xbar) = 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\ker \Hbar} F(\xbar) = 0$
- $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
- $G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
- $\mathcal{H}G$ is linearly geometrically compatible
- $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
- $\sup_{x \in U} \sup_{H_F \in \mathcal{H}F(x), H_G \in \mathcal{H}G(x)}
\|{(\proj_{\ker H_G}H_F\proj_{\ker H_G})}^{-1}\| \le \Omega$
- $\displaystyle{\lim_{x \to \xbar}\sup_{H_G \in \mathcal{H}G(x)}\frac{\|\proj_{\ker H_G}F(\xbar)\|}{\|x - \xbar\|}} = 0$

Then any Newton-type method converges superlinearly for any $x^0$ close enough to $\xbar$.
\[\minimize{x,y \in \bbR}{{(x - 2)}^2 + {(y - 2)}^2}{x^2 + y^2 - 1 = 0}\]
\[\minimize{x,y \in \bbR}{{(y - x^2)}^2 + {(1 - x)}^2}{x^2 + y^2 - 1 = 0} \]
\[\gdef\sker{\operatorname{sker}}
\minimize{x \in \bbRn}{f(x)}{G(x) \preceq 0}
\]
Mutatis mutandis:
\[\sker A = \{x~|~ Ax \preceq 0\}, \quad
\mathcal{B}(x, H) := \{y~|~G(x) + H(y - x) \preceq 0\}\]
First order optimality:
\[
\proj_{\sker \nabla G(\xbar)}\nabla f(\xbar) = 0
\]
Solution:
$\xbar$ such that $G(\xbar) = 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\sker \Hbar} F(\xbar) = 0$
-
$\mathcal{N}_{\mathcal{H}F, \mathcal{H}G}:\bbRn \setto \bbRn$,
\[\begin{aligned}
\mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x = \{\proj_{\mathcal{B}(x, H_G)}x + d~|~H_G \in \mathcal{H}G(x),H_F \in \mathcal{H}F(\proj_{\mathcal{B}(x, H_G)}x), \\
\proj_{\sker H_G}H_F\proj_{\sker H_G}d = -\proj_{\sker H_G}F(\proj_{\mathcal{B}(x, H_G)}x)\}
\end{aligned}\]
is the Newton operator
-
${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x^k$ is the Newton-type method
- $G(\xbar) \preceq 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\sker \Hbar} F(\xbar) = 0$
- $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
- $G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
- $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
- $\mathcal{H}G$ is linearly geometrically compatible
- $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
- $\sup_{x \in U} \sup_{H_F \in \mathcal{H}F(x), H_G \in \mathcal{H}G(x)}
\|{(\proj_{\sker H_G}H_F\proj_{\sker H_G})}^{-}\| \le \Omega$
- $\displaystyle{\lim_{x \to \xbar}\sup_{H_G \in \mathcal{H}G(x)}\frac{\|\proj_{\sker H_G}F(\xbar)\|}{\|x - \xbar\|}} = 0$
- $\displaystyle{\lim_{x \to \xbar}
\frac{\| \proj_{\sker H_G}F(\proj_{\mathcal{B}(x, H_G)}x) - \proj_{\sker H_G}F(\xbar)
- (\proj_{\sker H_G} H \proj_{\sker H_G})(x, \xbar) \|}{\|x - \xbar\|} = 0}$

Then any Newton-type method converges superlinearly for any $x^0$ close enough to $\xbar$.
\[\minimize{x,y \in \bbR}{{(x - 1)}^2 + {y}^2}{(x-1)^2 + (y-1)^2 - 1 \le 0} \]
\[\minimize{x,y \in \bbR}{{(y - x^2)}^2 + {(1 - x)}^2}{(x-1)^2 + (y-1)^2 - 1 \le 0} \]
Regularity: Newton Differentiability $+$ Operator Splitting $\Rightarrow$ Constrained Optimization