A Newton-type Method for Constrained Optimization

Titus Pința

Newton differentiability subsumes all aspects of Newton's method

  • First Order Methods
  • Stochastic Optimization
  • Quasi-Newton Methods
  • Equations on Metric Spaces
  • Constrained Optimization
  • \(\gdef\bbRn{\mathbb{R}^n}\mathcal{N}_{\mathcal{H}F}:\bbRn \rightrightarrows \bbRn\), \[ \mathcal{N}_{\mathcal{H}F} x = \{x + d~|~H \in \mathcal{H}F(x), Hd = -F(x)\}, \] is the Newton operator
  • \(\gdef\bbN{\mathbb{N}}{\{x^k\}}_{k \in \bbN}\), \(x^{k+1} \in \mathcal{N}_{\mathcal{H}F}x^k\) is the Newton-type method
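As a toy illustration (a hypothetical one-dimensional example, not from the slides): the piecewise-linear $F(x) = \max(x, x/2) - 1$ is Newton differentiable with $\mathcal{H}F(x) = \{1\}$ for $x > 0$ and $\{1/2\}$ for $x < 0$, and the Newton-type method reaches the root $\bar{x} = 1$ in two steps.

```julia
# Newton-type method: pick any H ∈ HF(x) and solve H d = -F(x).
F(x) = max(x, x/2) - 1        # root at x̄ = 1
HF(x) = x >= 0 ? 1.0 : 0.5    # one selection from the Newton derivative set

function newton_type(x; iters = 10)
    for _ in 1:iters
        d = -F(x) / HF(x)     # solve H d = -F(x)
        x = x + d
    end
    return x
end

newton_type(-4.0)             # -4 → 2 → 1, then stays at 1.0
```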
  • $\gdef\xbar{\bar{x}}F(\xbar) = 0$
  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|H^{-1}\| \le \Omega$

Then any Newton-type method converges superlinearly for any $x^0$ close enough to $\xbar$

  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  1. $F$ is $\mathcal{C^{1}}$, $\mathcal{H}F = \nabla F$
  2. $F$ is semismooth with $\mathcal{H}F = F'$
  3. $F$ is tame (definable in an $o$-minimal structure) with $\mathcal{H}F = F'$
  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  1. \[ F(x, y) = \begin{bmatrix} 4x(x^2 - y) + 2(x - 1) \\ -2(x^2 - y) \end{bmatrix} \quad \mathcal{H}F(x, y) = \left\{2 \begin{bmatrix} 1 + 4 x^2 & -2 x \\ -2 x & 1 \end{bmatrix}\right\} \]
     R. Bergmann, O. P. Ferreira, E. M. Santos, and J. C. O. Souza. "The difference of convex algorithm on Hadamard manifolds". In: J. Optim. Theory Appl. 201 (2021), pp. 221–251
  2. \[\gdef\Id{\operatorname{Id}} F(x) = \left \{ \begin{array}{rl} x & \quad x \in \mathbb{Q}, \\ -x & \quad x \in \bbR \setminus \mathbb{Q} \end{array} \right. \quad \mathcal{H}F(x) = \left \{ \begin{array}{rl} 1 & \quad x \in \mathbb{Q}, \\ -1 & \quad x \in \bbR \setminus \mathbb{Q} \end{array} \right. \]
  3. $F$ is Lipschitz, with $\mathcal{H}F = \alpha\Id$ (then $F$ is weakly Newton differentiable)
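Example 1 can be checked numerically; a minimal Julia sketch, taking $F = \nabla f$ for the Rosenbrock function $f(x, y) = (x^2 - y)^2 + (1 - x)^2$ and the displayed Gauss–Newton element of $\mathcal{H}F$, which differs from $\nabla^2 f$ only by a term vanishing at the root $(1, 1)$:

```julia
using LinearAlgebra

# F = ∇f for f(x, y) = (x² - y)² + (1 - x)²; the root is (1, 1).
F(v) = [4v[1] * (v[1]^2 - v[2]) + 2(v[1] - 1), -2(v[1]^2 - v[2])]
# Gauss–Newton choice of H ∈ HF(v): agrees with ∇²f up to a term vanishing on x² = y.
HF(v) = (x = v[1]; 2 * [(1 + 4x^2) (-2x); (-2x) 1.0])

function newton(v; iters = 10)
    for _ in 1:iters
        v = v + HF(v) \ (-F(v))   # Newton step: solve H d = -F(v)
    end
    return v
end

newton([0.9, 0.8])                # converges to [1.0, 1.0]
```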
$F:\bbRn \to \bbRn$ is weakly Newton differentiable at $\xbar$: $\exists\gdef\setto{\rightrightarrows}\gdef\bbRnxn{\mathbb{R}^{n \times n}}\mathcal{H}F:\bbRn \setto \bbRnxn$ \[ \lim_{x \to \xbar}\sup_{H \in \mathcal{H}F(x)}\frac{\|F(x) - F(\xbar) - H(x - \xbar)\|}{\|x - \xbar\|} < \infty \]
  • $F(\xbar) = 0$
  • $F:\bbRn \to \bbRn$ is weakly Newton differentiable at $\xbar$ with $\mathcal{H}F$
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|H^{-1}\| \le \Omega$
  • $\Omega\lim_{x \to \xbar}\sup_{H \in \mathcal{H}F(x)}\frac{\|F(x) - F(\xbar) - H(x - \xbar)\|}{\|x - \xbar\|} < 1$

Then any Newton-type method converges linearly for any $x^0$ close enough to $\xbar$

  1. $F$ is Lipschitz, with $\mathcal{H}F = \alpha\Id$

For $F = \nabla f$, the Newton operator $\mathcal{N}_{\alpha \Id}x = x - \alpha^{-1} \nabla f(x)$ is gradient descent with step size $\alpha^{-1}$
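A one-line check (a made-up quadratic; $\alpha$ is any constant at least the Lipschitz modulus of $\nabla f$):

```julia
# N_{αId} x = x - α⁻¹ ∇f(x) is gradient descent with step size 1/α.
gradf(x) = 2x                    # ∇f for f(x) = x², Lipschitz with constant 2

function descend(x; α = 4.0, iters = 20)
    for _ in 1:iters
        x = x - gradf(x) / α     # linear convergence: contraction factor |1 - 2/α| = 1/2
    end
    return x
end

descend(1.0)                     # 0.5^20 ≈ 9.54e-7
```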

If either

  • $\nabla f$ Lipschitz, hypo-monotone and metric subregular
  • $f$ has the KŁ property and ${\{x^k\}}_{k \in \bbN}$ has sufficient decrease and a gradient lower bound

then $\nabla f$ is weakly Newton differentiable and Newton's method converges linearly

  • If $F$ is Fréchet differentiable on a set, with $\nabla F$ Lipschitz continuous, \[\|\mathcal{H}F(x) - \mathcal{H}F(y)\| \le L\|x - y\| \]
  • $B := \|{\mathcal{H}F(x^0)}^{-1}\|$ and $\eta := \|{\mathcal{H}F(x^0)}^{-1}F(x^0)\|$,

    \[LB\eta \le \frac{1}{2} \]

then $\exists \xbar$ such that $F(\xbar) = 0$

  • If $F$ is Newton differentiable on a set, with $\gdef\tbar{\bar{t}}\mathcal{H}F$ Hölder continuous, \[\|\mathcal{H}F(x) - \mathcal{H}F(y)\| \le L\|x - y\|^{\gamma-1}\]
  • $B := \|{\mathcal{H}F(x^0)}^{-1}\|$ and $\eta := \|{\mathcal{H}F(x^0)}^{-1}F(x^0)\|$,

    \[ \begin{aligned} &\frac{LB}{2}{\left(\frac{f(t)}{f'(t)}\right)}^\gamma \le f\left(t - \frac{f(t)}{f'(t)}\right), & f(0) = \eta,\quad f(t) > 0,\quad f(\tbar) = 0, \\ &f'(t) < 0,\quad -f'(t) \ge {(1 + L t)}^{-1}, & f''(t) > 0 \end{aligned} \]

then $\exists \xbar$ such that $F(\xbar) = 0$

Modified Newton Methods

$F$ is Newton differentiable with $\mathcal{H}F$:

  1. is Newton differentiable with $\mathcal{G}F(x) = \mathcal{H}F(x) + B_{\varepsilon(x)}[0]$
  2. is Newton differentiable with $\mathcal{G}F(x)=\mathcal{H} F(\phi(x))$ for $\lim_{x \to \xbar}\phi(x) = \xbar$
  3. is Newton differentiable with any $\mathcal{G}F$ with \[\gdef\Hbar{\bar{H}} \lim_{x \to \xbar} \sup_{H \in \mathcal{H}F(x), G \in \mathcal{G}F(x)} \frac{\|H(x - \xbar) - G(x - \xbar)\|}{\|x - \xbar\|} = 0 \]
  4. is Newton differentiable for $H$ produced by a quasi-Newton update rule $\mathcal{U}:\bbRn \times \bbRnxn \to \bbRnxn$ with \[ \lim_{x, H \to \xbar, \Hbar} \frac{\|\mathcal{U}(x, H) - \Hbar\|_F}{\|H - \Hbar\|_F} = 0 \]
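For case 4, the one-dimensional secant method is the simplest instance (a generic illustration, not the talk's specific update rule): $\mathcal{U}$ builds $H$ from two function values, and convergence to the root stays superlinear.

```julia
# Secant method on F(x) = x² - 2: the update H = (F(x₁) - F(x₀)) / (x₁ - x₀)
# plays the role of a quasi-Newton rule U; the root is √2.
F(x) = x^2 - 2

function secant(x0, x1; tol = 1e-12, iters = 12)
    for _ in 1:iters
        abs(F(x1)) < tol && break              # stop once the residual is negligible
        H = (F(x1) - F(x0)) / (x1 - x0)        # quasi-Newton update from samples
        x0, x1 = x1, x1 - F(x1) / H            # Newton-type step with H
    end
    return x1
end

secant(1.0, 2.0)                               # ≈ 1.41421356…
```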

Stochastic Newton

In the setting of optimization $H$ is sampled randomly and we employ a backtracking strategy for sufficient decrease, i.e. \[f(x^k) - f(y) \ge c_k\|y - x^k\|\]

Assume for a given $x$, $\mathbb{P}(H \in \mathcal{H}F(x)) > 1 - \delta$ and let $K$ be the number of iterations until $\gdef\bbE{\mathbb{E}}\|\nabla f(x^K)\| \le \varepsilon$.

Then \[\bbE(K) \le \frac{1}{1 - \delta}\left( \frac{f(x^0) - f(\xbar)}{\rho_{\min}{(d\varepsilon)}^2} + \frac{\log \frac{\rho_{\min}}{\rho^0}}{\log \alpha } \right) \]

In particular, for additive Gaussian noise \[\bbE(K) \le \frac{1}{1 - n^{-1}}\left(\frac{f(x^0) - f(\xbar)}{0.5(1-c){(\omega^{-1}\varepsilon)}^2} + \frac{\log \frac{0.5(1-c)}{\rho^0}}{\log \alpha } \right) \]

Quasi-Metric Spaces

$\gdef\bbM{\mathbb{M}}H:\bbM \times \bbM \to \bbRn$ is called pseudo-linear if $H(x, x) = 0$

$H^{-}:\bbM \times \bbRn \to \bbM$ is a quasi-inverse if $H^{-}(x, 0) = x$

\[\gdef\dist{\operatorname{dist}} \begin{aligned} \exists \|H^{-}\| &\in [0, \infty) \\ &\dist(H^{-}(x, v), H^{-}(y, w)) \le \|H^{-}\|\|v - w - H(x, y)\| \end{aligned} \]

$F:\bbM \to \bbRn$ is Newton differentiable at $\xbar$ if there is $\mathcal{H}F$ with \[ \lim_{x \to \xbar}\sup_{H \in \mathcal{H}F(x)}\frac{\|F(x) - F(\xbar) - H(x, \xbar)\|}{\dist(x, \xbar)} = 0 \]

  1. Convergence Theorem
  2. Kantorovich Theorem

Constrained Optimization

Warm Up

\[ \gdef\minim{\operatorname{minimize}} \gdef\st{\operatorname{subject\ to}} \gdef\minimize#1#2#3{{\begin{array}{ll}% \displaystyle{\underset{#1}{\minim}}\quad& #2 \\% \st & #3% \end{array}}} \gdef\minprob#1#2{\begin{array}{ll}% \displaystyle{\minim_{#1}}\quad& #2 \\% \end{array}} \minimize{x \in \bbRn}{f(x)}{Ax = b}\]

Consider $\tilde{f}: \ker A \cong \bbR^{n - m} \to \bbR$, $\tilde{f} = f|_{\{Ax = b\}}$

It remains to relate $\nabla \tilde{f}$ and $\nabla^2 \tilde{f}$ to $f$

\[\gdef\proj{\operatorname{P}} \nabla \tilde{f} = \proj_{\ker A} \nabla f \qquad \nabla^2 \tilde{f} = \proj_{\ker A} \circ \nabla^2 f \circ \proj_{\ker A} \]

\[ \mathcal{N}_{f}x := x - {(\proj_{\ker A} \circ \nabla^2 f(x) \circ \proj_{\ker A})}^{-1} \proj_{\ker A} \nabla f(x) \]
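For a quadratic $f$ this operator reaches the constrained stationary point in one step; a dependency-free Julia sketch with made-up problem data:

```julia
using LinearAlgebra

# minimize ½xᵀQx - cᵀx  subject to  Ax = b, using the projected Newton operator.
Q = Diagonal([2.0, 3.0, 4.0])
c = [1.0, 1.0, 1.0]
A = [1.0 1.0 1.0]
b = [1.0]

gradf(x) = Q * x - c

P  = I - pinv(A) * A                        # orthogonal projection onto ker A
x0 = [1.0, 0.0, 0.0]                        # feasible: A*x0 == b
x1 = x0 - pinv(P * Q * P) * (P * gradf(x0)) # one projected Newton step

# x1 stays feasible (the step lies in ker A) and P*gradf(x1) ≈ 0
```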

Problem setup:
Find $\xbar$ such that $F(\xbar) \perp \ker A$

Let $\mathcal{A} := \{x ~|~ Ax = b\}$

  • $\mathcal{N}_{\mathcal{H}F}:\bbRn \setto \bbRn$, \[ \mathcal{N}_{\mathcal{H}F}x = \left \{ \begin{aligned} \proj_{\mathcal{A}}x + d~|~H \in \mathcal{H}F(\proj_{\mathcal{A}}x),& \proj_{\ker A}H\proj_{\ker A}d = \\ &-\proj_{\ker A}F(\proj_{\mathcal{A}}x) \end{aligned}\right\}\] is the Newton operator
  • ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}F}x^k$ is the Newton-type method
  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  • $\proj_{\ker A} F(\xbar) = 0$
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}F(x)}\|{(\proj_{\ker A}H\proj_{\ker A})}^{-1}\| \le \Omega$

Then any Newton-type method converges superlinearly for any $x^0$ close enough to $\xbar$

Another Warm Up

\[\minimize{x \in \bbRn}{0}{G(x) = 0}\]

The set $\mathcal{A}(x) = \{y~|~G(x) + \nabla G(x)(y - x) = 0\}$ is not a singleton

We can project \[ x^{k + 1} = \proj_{\mathcal{A}(x^k)}x^k \]

When $\nabla G(x)$ has full rank, \[ x^{k + 1} = x^k - {\nabla G(x^k)}^{+}G(x^k) \]
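On the unit circle $G(x, y) = x^2 + y^2 - 1$ (a concrete instance chosen for illustration), this projection iteration is ordinary Gauss–Newton for feasibility:

```julia
using LinearAlgebra

G(v) = [v[1]^2 + v[2]^2 - 1]     # the unit circle as a zero set
J(v) = [2v[1] 2v[2]]             # derivative of G, a 1×2 matrix

function project_onto_zero_set(v; iters = 10)
    for _ in 1:iters
        v = v - pinv(J(v)) * G(v)   # x⁺ = x - ∇G(x)⁺ G(x)
    end
    return v
end

project_onto_zero_set([2.0, 0.0])   # approaches [1.0, 0.0]
```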

  • $\mathcal{N}_{\mathcal{H}G}:\bbRn \setto \bbRn$, \[ \mathcal{N}_{\mathcal{H}G}x = \left \{ \begin{aligned} \proj_{\mathcal{A}(x, H)}x~|~&H \in \mathcal{H}G(x), \mathcal{A}(x, H) \\ &= \{y~|~G(x) + H(y - x) = 0\} \end{aligned}\right\}\] is the Newton operator
  • ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}G}x^k$ is the Newton-type method
  • $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
  • $\gdef\bbRm{\mathbb{R}^m}G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
  • $\mathcal{H}G$ is linearly geometrically compatible

Then the Newton-type method approaches $\mathcal{Z}$ superlinearly for any $x^0$ close enough to $\mathcal{Z}$

\[ \minimize{x \in \bbRn}{f(x)}{G(x) = 0} \]

First order optimality:

\[ \proj_{\ker \nabla G(\xbar)}\nabla f(\xbar) = 0 \]

Problem:

Given $F$, $\mathcal{H}F$ and $G$, $\mathcal{H}G$, find $\xbar$ with \[ \sup_{H_G \in \mathcal{H}G(\xbar)}\|\proj_{\ker H_G}F(\xbar)\| = 0 \]

The algorithm for linear equality constraints

\[ \mathcal{N}_{\mathcal{H}F}x = \{\proj_{\mathcal{A}}x + d~|~H_F \in \mathcal{H}F(\proj_{\mathcal{A}}x), \proj_{\ker A}H_F\proj_{\ker A}d = -\proj_{\ker A}F(\proj_{\mathcal{A}}x)\} \]
  • $\mathcal{N}_{\mathcal{H}F, \mathcal{H}G}:\bbRn \setto \bbRn$, \[\begin{aligned} \mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x = \{\proj_{\mathcal{A}(x, H_G)}x + d~|~H_G \in \mathcal{H}G(x),H_F \in \mathcal{H}F(\proj_{\mathcal{A}(x, H_G)}x), \\ \proj_{\ker H_G}H_F\proj_{\ker H_G}d = -\proj_{\ker H_G}F(\proj_{\mathcal{A}(x, H_G)}x)\} \end{aligned}\] is the Newton operator
  • ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x^k$ is the Newton-type method

using LinearAlgebra
using Krylov

function step(x, F, H, G, J)
    Jₓ = J(x)                    # Newton derivative of the constraint map G at x
    J⁺ = pinv(Jₓ)                # Moore–Penrose pseudo-inverse
    xᴾ = x - J⁺ * G(x)           # Gauss–Newton projection onto the linearized constraint set
    P = I - J⁺ * Jₓ              # orthogonal projection onto ker Jₓ

    b = P * F(xᴾ)                # projected residual
    A = P * H(xᴾ) * P            # projected Newton derivative of F

    d, _ = Krylov.minres(A, -b)  # tangential Newton step: solve A d = -b

    return xᴾ + d
end
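A quick sanity check of `step` on a circle-constrained problem (with `Krylov.minres` swapped for a dense `pinv` solve so the sketch needs only the standard library; the problem data is made up): the returned point satisfies the linearized constraint $G(x) + \nabla G(x)(x^{+} - x) = 0$.

```julia
using LinearAlgebra

# Dense variant of `step`: pinv in place of Krylov.minres.
function step_dense(x, F, H, G, J)
    Jₓ = J(x)
    J⁺ = pinv(Jₓ)
    xᴾ = x - J⁺ * G(x)          # project onto the linearized constraint set
    P = I - J⁺ * Jₓ             # projection onto ker Jₓ
    d = pinv(P * H(xᴾ) * P) * (-P * F(xᴾ))   # minimum-norm tangential Newton step
    return xᴾ + d
end

# minimize (x - 2)² + (y - 2)²  subject to  x² + y² - 1 = 0
F(v) = [2(v[1] - 2), 2(v[2] - 2)]   # ∇f
H(v) = [2.0 0.0; 0.0 2.0]           # ∇²f
G(v) = [v[1]^2 + v[2]^2 - 1]
J(v) = [2v[1] 2v[2]]

x  = [1.0, 0.5]
x1 = step_dense(x, F, H, G, J)
# G(x) + J(x) * (x1 - x) ≈ [0.0]
```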
  
  • $G(\xbar) = 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\ker \Hbar} F(\xbar) = 0$
  • $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
  • $G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
  • $\mathcal{H}G$ is linearly geometrically compatible
  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  • $\sup_{x \in U} \sup_{H_F \in \mathcal{H}F(x), H_G \in \mathcal{H}G(x)} \|{(\proj_{\ker H_G}H_F\proj_{\ker H_G})}^{-1}\| \le \Omega$
  • $\displaystyle{\lim_{x \to \xbar}\sup_{H_G \in \mathcal{H}G(x)}\frac{\|\proj_{\ker H_G}F(\xbar)\|}{\|x - \xbar\|}} = 0$

Then any Newton-type method converges superlinearly for any $x^0$ close
enough to $\bar{x}$

\[\minimize{x,y \in \bbR}{{(x - 2)}^2 + {(y - 2)}^2}{x^2 + y^2 - 1 = 0}\]

\[\minimize{x,y \in \bbR}{{(y - x^2)}^2 + {(1 - x)}^2}{x^2 + y^2 - 1 = 0} \]

\[\gdef\sker{\operatorname{sker}} \minimize{x \in \bbRn}{f(x)}{G(x) \preceq 0} \]

Mutatis mutandis:
\[\sker A = \{x~|~ Ax \preceq 0\}, \quad \mathcal{B}(x, H) := \{y~|~G(x) + H(y - x) \preceq 0\}\]
First order optimality:

\[ \proj_{\sker \nabla G(\xbar)}\nabla f(\xbar) = 0 \]

Solution:

$\xbar$ such that $G(\xbar) = 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\sker \Hbar} F(\xbar) = 0$

  • $\mathcal{N}_{\mathcal{H}F, \mathcal{H}G}:\bbRn \setto \bbRn$, \[\begin{aligned} \mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x = \{\proj_{\mathcal{B}(x, H_G)}x + d~|~H_G \in \mathcal{H}G(x),H_F \in \mathcal{H}F(\proj_{\mathcal{B}(x, H_G)}x), \\ \proj_{\sker H_G}H_F\proj_{\sker H_G}d = -\proj_{\sker H_G}F(\proj_{\mathcal{B}(x, H_G)}x)\} \end{aligned}\] is the Newton operator
  • ${\{x^k\}}_{k \in \bbN}$, $x^{k+1} \in \mathcal{N}_{\mathcal{H}F, \mathcal{H}G}x^k$ is the Newton-type method
  • $G(\xbar) \preceq 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\sker \Hbar} F(\xbar) = 0$
  • $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
  • $G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
  • $\mathcal{H}G$ is linearly geometrically compatible
  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  • $\sup_{x \in U} \sup_{H_F \in \mathcal{H}F(x), H_G \in \mathcal{H}G(x)} \|{(\proj_{\sker H_G}H_F\proj_{\sker H_G})}^{-}\| \le \Omega$
  • $\displaystyle{\lim_{x \to \xbar}\sup_{H_G \in \mathcal{H}G(x)}\frac{\|\proj_{\sker H_G}F(\xbar)\|}{\|x - \xbar\|}} = 0$
  • ${\lim_{x \to \xbar} \frac{\| \proj_{\sker H_G}F(\proj_{\mathcal{B}(x, H_G)}x) - \proj_{\sker H_G}F(\xbar) - (\proj_{\sker H_G} H \proj_{\sker H_G})(x - \xbar) \|}{\|x - \xbar\|} = 0}$

Then any Newton-type method converges superlinearly for any $x^0$ close
enough to $\bar{x}$

\[\minimize{x,y \in \bbR}{{(x - 1)}^2 + {y}^2}{(x-1)^2 + (y-1)^2 - 1 \le 0} \]

\[\minimize{x,y \in \bbR}{{(y - x^2)}^2 + {(1 - x)}^2}{(x-1)^2 + (y-1)^2 - 1 \le 0} \]

Regularity: Newton Differentiability

$+$

Operator Splitting

$\Rightarrow$

Constrained Optimization

  • $G(\xbar) = 0 $ and $\forall \Hbar \in \mathcal{H}G(\xbar), \proj_{\ker \Hbar} F(\xbar) = 0$
  • $\mathcal{Z} := \{x~|~G(x) = 0\} \ne \emptyset$, $\proj_{\mathcal{Z}}$ is Lipschitz
  • $G:\bbRn \to \bbRm$ is Newton differentiable near $\mathcal{Z}$ with $\mathcal{H}G$ and Lipschitz
  • $\sup_{x \in U \in \mathcal{V}(\xbar)} \sup_{H \in \mathcal{H}G(x)}\|H^{+}\| \le \Omega$ and $H$ full rank
  • $\mathcal{H}G$ is linearly geometrically compatible
  • $F:\bbRn \to \bbRn$ is Newton differentiable at $\xbar$ with $\mathcal{H}F$
  • $\sup_{x \in U} \sup_{H_F \in \mathcal{H}F(x), H_G \in \mathcal{H}G(x)} \|{(\proj_{\ker H_G}H_F\proj_{\ker H_G})}^{-1}\| \le \Omega$
  • $\displaystyle{\lim_{x \to \xbar}\sup_{H_G \in \mathcal{H}G(x)}\frac{\|\proj_{\ker H_G}F(\xbar)\|}{\|x - \xbar\|}} = 0$

Then any Newton-type method converges superlinearly for any $x^0$ close
enough to $\bar{x}$