3 Implicit Function and Inverse Mapping Theorem

3.1 Implicit Function Theorem

Theorem 3.1 (Implicit Function Theorem). Let $F: W \to \mathbb{R}^n$ be a $\mathscr{C}^k$ differentiable mapping on an open set $W \subseteq \mathbb{R}^m \times \mathbb{R}^n$. Suppose $(\mathbf{x}_0, \mathbf{y}_0) \in W$ satisfies $F(\mathbf{x}_0, \mathbf{y}_0) = 0$ and the partial derivative $\partial_{\mathbf{y}} F(\mathbf{x}_0, \mathbf{y}_0)$ is invertible.

Then there exist a neighborhood $U \subseteq \mathbb{R}^m$ of $\mathbf{x}_0$, a neighborhood $V \subseteq \mathbb{R}^n$ of $\mathbf{y}_0$, and a $\mathscr{C}^k$ mapping $f: U \to \mathbb{R}^n$ such that:

(1) For all $\mathbf{x} \in U$,

\begin{align*} F(\mathbf{x}, f(\mathbf{x})) = 0, \end{align*}

and

(2) If $(\mathbf{x}, \mathbf{y}) \in U \times V$ satisfies $F(\mathbf{x}, \mathbf{y}) = 0$, then

\begin{align*} \mathbf{y} = f(\mathbf{x}). \end{align*}
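For example, $F(x, y) = x^2 + y^2 - 1$ defines the unit circle, and at $(x_0, y_0) = (0, 1)$ we have $\partial_y F = 2y_0 = 2 \neq 0$. The sketch below (illustrative Python; the names `F` and `implicit_y` are ours) recovers the implicit branch $f(x) = \sqrt{1 - x^2}$ numerically by Newton iteration in $y$:

```python
import math

# F(x, y) = x^2 + y^2 - 1 defines the unit circle; at (x0, y0) = (0, 1)
# the partial derivative dF/dy = 2*y0 = 2 is nonzero.
def F(x, y):
    return x**2 + y**2 - 1.0

def implicit_y(x, y=1.0):
    # Newton iteration in y alone, valid because dF/dy = 2y stays nonzero
    # near (0, 1); it converges to the branch f(x) = sqrt(1 - x^2)
    for _ in range(50):
        y -= F(x, y) / (2.0 * y)
    return y

x = 0.3
y = implicit_y(x)
assert abs(F(x, y)) < 1e-12                      # F(x, f(x)) = 0
assert abs(y - math.sqrt(1 - x**2)) < 1e-12      # matches the explicit branch
# implicit differentiation also gives the slope: f'(x) = -F_x / F_y = -x / y
slope = -x / y
```

The theorem guarantees that this iteration is solving for a well-defined, unique $\mathscr{C}^k$ branch near $(0, 1)$.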

3.2 Inverse Mapping Theorem

Theorem 3.2 (Inverse Mapping Theorem). Let $f: U \to \mathbb{R}^n$ be $\mathscr{C}^k$ differentiable on an open set $U$. If $\mathrm{d}f(\mathbf{x}_0)$ is invertible, then there exist a neighborhood $V$ of $\mathbf{x}_0$ and a neighborhood $W$ of $f(\mathbf{x}_0)$ such that

\begin{align*} f: V \to W \end{align*}

has a $\mathscr{C}^k$ inverse mapping.
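The conclusion is genuinely local: $f(x, y) = (e^x \cos y, e^x \sin y)$ has $\det \mathrm{d}f = e^{2x} > 0$ everywhere, yet it is $2\pi$-periodic in $y$ and hence not globally injective. A quick check (illustrative Python, not part of the notes):

```python
import math

# f(x, y) = (e^x cos y, e^x sin y): df is invertible at every point,
# but f admits only local inverses.
def f(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_det(x, y):
    # Jacobian matrix [[e^x cos y, -e^x sin y], [e^x sin y, e^x cos y]]
    a, b = math.exp(x) * math.cos(y), -math.exp(x) * math.sin(y)
    c, d = math.exp(x) * math.sin(y),  math.exp(x) * math.cos(y)
    return a * d - b * c          # equals e^(2x) > 0

assert jacobian_det(0.0, 0.0) > 0                       # df invertible
assert abs(jacobian_det(1.5, 2.0) - math.exp(3.0)) < 1e-9
# yet f is 2*pi-periodic in y, so it has no global inverse:
p, q = f(0.0, 0.0), f(0.0, 2.0 * math.pi)
assert abs(p[0] - q[0]) < 1e-12 and abs(p[1] - q[1]) < 1e-12
```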

Definition. A mapping $f: U \to V$ is called a $\mathscr{C}^k$ diffeomorphism if $f$ is a $\mathscr{C}^k$ mapping and there exists a $\mathscr{C}^k$ inverse mapping $g: V \to U$ such that

\begin{align*} g(f(\mathbf{x})) = \mathbf{x}, \quad f(g(\mathbf{y})) = \mathbf{y}, \quad \forall \mathbf{x} \in U, \mathbf{y} \in V. \end{align*}

Theorem 3.3. Let $U \subseteq \mathbb{R}^n$ be an open set, and let $f: U \to \mathbb{R}^n$ be a $\mathscr{C}^k$ mapping. Then

\begin{align*} V = f(U) \text{ is an open set and } f: U \to V \text{ is a } \mathscr{C}^k \text{ diffeomorphism} \end{align*}

if and only if

\begin{align*} f \text{ is injective, and for every } \mathbf{x} \in U,\ \mathrm{d}f(\mathbf{x}) \text{ is an invertible linear mapping.} \end{align*}
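Both conditions are needed: $f(x) = x^3$ is an injective $\mathscr{C}^\infty$ bijection of $\mathbb{R}$, but $\mathrm{d}f(0) = 0$ is not invertible, and indeed its inverse $y \mapsto y^{1/3}$ fails to be differentiable at $0$. A small numerical illustration (Python; the names are ours):

```python
import math

# f(x) = x^3 is a smooth bijection of R, but df(0) = 0 is not invertible,
# so f is not a diffeomorphism onto its image near 0.
def f(x):
    return x ** 3

def g(y):
    # real cube root (math.copysign preserves the sign for negative y)
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

assert abs(g(f(0.5)) - 0.5) < 1e-12      # g does invert f
assert abs(f(g(-0.2)) + 0.2) < 1e-12
# but the difference quotient of g at 0 blows up as h -> 0,
# so the inverse is not differentiable there:
assert (g(1e-12) - g(0.0)) / 1e-12 > 1e3
```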

3.4 Surfaces, Tangent Planes and Normal Vectors

Surface

Let $1 \leq m \leq n-1$. We say that $\Sigma \subset \mathbb{R}^n$ is an $m$-dimensional $\mathscr{C}^k$ surface if, for every $\mathbf{x}_0 \in \Sigma$, $\Sigma$ is locally the graph of a $\mathscr{C}^k$ mapping of $m$ variables near $\mathbf{x}_0$. That is, there exist a neighborhood $U$ of $\mathbf{x}_0$, a permutation $\sigma(1), \sigma(2), \dots, \sigma(n)$ of $1, 2, \dots, n$, and a $\mathscr{C}^k$ mapping $g$ such that for every $\mathbf{x} = (x^1, x^2, \dots, x^n) \in U$,

\begin{align*} \mathbf{x} = (x^1, x^2, \dots, x^n) \in \Sigma \iff (x^{\sigma(m+1)}, \dots, x^{\sigma(n)}) = g(x^{\sigma(1)}, \dots, x^{\sigma(m)}). \end{align*}

When $m = n-1$, $\Sigma$ is called a $\mathscr{C}^k$ hypersurface.

When $m = 1$, $\Sigma$ is a $\mathscr{C}^k$ curve.

Theorem 3.4.

  1. Let $U \subseteq \mathbb{R}^m$ be a region and let $F: U \to \mathbb{R}^n$ be a $\mathscr{C}^k$ mapping. If $a$ is a regular value of $F$, meaning that for every $\mathbf{x} \in U$ satisfying $F(\mathbf{x}) = a$, $\mathrm{d}F(\mathbf{x})$ has full row rank, then the level set of $F$

\begin{align*} \Sigma = \{\mathbf{x} \in U \mid F(\mathbf{x}) = a\} \end{align*}

is an $(m - n)$-dimensional $\mathscr{C}^k$ surface in $\mathbb{R}^m$.

  2. Let $U \subseteq \mathbb{R}^m$ be an open set, and let $f: U \to \mathbb{R}^n$ ($\mathbf{x} = f(u^1, \dots, u^m)$) be an injective $\mathscr{C}^k$ mapping such that for every $\mathbf{u} \in U$, $\operatorname{rank}(\mathrm{d}f(\mathbf{u})) = m$, i.e., $\mathrm{d}f(\mathbf{u})$ has full column rank. Then

\begin{align*} f(U) \subseteq \mathbb{R}^n \end{align*}

is an $m$-dimensional $\mathscr{C}^k$ surface in $\mathbb{R}^n$, and $f$ is called a $\mathscr{C}^k$ parametric representation of this surface.
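Both descriptions apply to the unit sphere $S^2 \subset \mathbb{R}^3$: it is the regular level set $F(\mathbf{x}) = |\mathbf{x}|^2 = 1$ (dimension $3 - 1 = 2$), and $f(u, v) = (\cos u \cos v, \cos u \sin v, \sin u)$ is a parametric representation on a suitable open set. The sketch below (illustrative Python) checks the two rank conditions at sample points:

```python
import math

# Level-set description: F(x) = x1^2 + x2^2 + x3^2, regular value a = 1.
def grad_F(p):
    return tuple(2 * c for c in p)

p = (1 / math.sqrt(3),) * 3                   # a point on the sphere
assert abs(sum(c * c for c in p) - 1) < 1e-12
assert any(abs(g) > 0 for g in grad_F(p))     # dF has full row rank (rank 1)

# Parametric description: f(u, v) = (cos u cos v, cos u sin v, sin u).
def df_columns(u, v):
    # the two columns of df, i.e. the partial derivatives f_u and f_v
    fu = (-math.sin(u) * math.cos(v), -math.sin(u) * math.sin(v), math.cos(u))
    fv = (-math.cos(u) * math.sin(v),  math.cos(u) * math.cos(v), 0.0)
    return fu, fv

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

fu, fv = df_columns(0.4, 1.1)
n = cross(fu, fv)
assert sum(c * c for c in n) > 1e-6           # columns independent: rank df = 2
```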

Tangent Space

Let $\Sigma$ be a $\mathscr{C}^k$ surface in $\mathbb{R}^n$ and $\mathbf{x}_0 \in \Sigma$. Let $\gamma: (a, b) \to \mathbb{R}^n$ be a $\mathscr{C}^1$ curve on $\Sigma$ passing through $\mathbf{x}_0$, meaning that $\gamma(t) \in \Sigma$ for every $t \in (a, b)$ and $\gamma(t_0) = \mathbf{x}_0$ for some $t_0 \in (a, b)$. In this case, $\gamma'(t_0) \in \mathbb{R}^n$ is called a tangent vector of $\Sigma$ at $\mathbf{x}_0$. Let $T_{\mathbf{x}_0}\Sigma$ denote the set of all tangent vectors of $\Sigma$ at $\mathbf{x}_0$; it is called the tangent space of $\Sigma$ at $\mathbf{x}_0$.

The tangent plane of the surface $\Sigma$ at $\mathbf{x}_0 \in \Sigma$ is

\begin{align*} \mathbf{x}_0 + T_{\mathbf{x}_0}\Sigma. \end{align*}

The normal space of the surface $\Sigma$ at $\mathbf{x}_0 \in \Sigma$ is

\begin{align*} (T_{\mathbf{x}_0}\Sigma)^\perp, \end{align*}

whose elements are called normal vectors of $\Sigma$ at $\mathbf{x}_0$.

The normal line / normal plane of the surface $\Sigma$ at $\mathbf{x}_0 \in \Sigma$ is

\begin{align*} \mathbf{x}_0 + (T_{\mathbf{x}_0}\Sigma)^\perp. \end{align*}

Theorem 3.5. Let $\Sigma \subseteq \mathbb{R}^n$ be an $m$-dimensional $\mathscr{C}^k$ surface. Then for any $\mathbf{x}_0 \in \Sigma$, the tangent space $T_{\mathbf{x}_0}\Sigma$ is an $m$-dimensional linear subspace of $\mathbb{R}^n$, and

  1. If $\Sigma$ is a regular level set of a $\mathscr{C}^k$ mapping $F$, $\Sigma = \{\mathbf{x} \in \mathbb{R}^n \mid F(\mathbf{x}) = a\}$, then

\begin{align*} T_{\mathbf{x}_0}\Sigma &= \operatorname{Ker} \mathrm{d}F(\mathbf{x}_0) \\ &= \{\mathbf{v} \in \mathbb{R}^n \mid \mathrm{d}F(\mathbf{x}_0)(\mathbf{v}) = \mathbf{0}\}. \end{align*}

Equation of the tangent plane:

\begin{align*} \mathrm{d}F(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) = \mathbf{0}. \end{align*}

The gradient vector $\nabla F(\mathbf{x}_0)$ is a normal vector of the surface $\Sigma$ at $\mathbf{x}_0 \in \Sigma$.

  2. If $f: U \to \mathbb{R}^n$ is a $\mathscr{C}^k$ parametric representation of $\Sigma$, then for $\mathbf{x}_0 = f(\mathbf{u}_0)$,

\begin{align*} T_{\mathbf{x}_0}\Sigma &= \operatorname{Range} \mathrm{d}f(\mathbf{u}_0) \\ &= \{\mathrm{d}f(\mathbf{u}_0)(\mathbf{v}) \in \mathbb{R}^n \mid \mathbf{v} \in \mathbb{R}^m\}. \end{align*}

Equation of the tangent plane:

\begin{align*} \mathbf{x} = \mathbf{x}_0 + \mathrm{d}f(\mathbf{u}_0)(\mathbf{v}), \quad \mathbf{v} \in \mathbb{R}^m. \end{align*}

The normal space of the surface $\Sigma$ at $\mathbf{x}_0 \in \Sigma$ is $\operatorname{Ker}((\mathrm{d}f(\mathbf{u}_0))^T)$.
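As a concrete check of part 1, take the sphere $F(\mathbf{x}) = |\mathbf{x}|^2 = 1$ and the curve $\gamma(t) = (\cos t, \sin t, 0)$ on it through $\mathbf{x}_0 = (1, 0, 0)$: the tangent vector $\gamma'(0)$ should lie in $\operatorname{Ker} \mathrm{d}F(\mathbf{x}_0)$, i.e., be orthogonal to $\nabla F(\mathbf{x}_0) = 2\mathbf{x}_0$ (illustrative Python, using a finite-difference derivative):

```python
import math

# Sphere F(x) = |x|^2 = 1; the normal at x0 is grad F(x0) = 2*x0.
def gamma(t):
    # a C^1 curve on the unit sphere with gamma(0) = (1, 0, 0)
    return (math.cos(t), math.sin(t), 0.0)

def gamma_prime(t, h=1e-6):
    # central-difference approximation of the tangent vector
    a, b = gamma(t + h), gamma(t - h)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

x0 = gamma(0.0)
v = gamma_prime(0.0)                       # a tangent vector at x0
normal = tuple(2 * c for c in x0)          # grad F(x0)
dot = sum(vi * ni for vi, ni in zip(v, normal))
assert abs(dot) < 1e-6                     # v lies in Ker dF(x0)
```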

3.5 Application: Constrained Extremum and Lagrange Multiplier Method

Since the constraint $g = 0$ is usually not easy to solve explicitly, Lagrange proposed considering the following augmented function:

\begin{align*} F(\mathbf{x}, \lambda_1, \dots, \lambda_r) = f(\mathbf{x}) - \lambda_1 g_1(\mathbf{x}) - \dots - \lambda_r g_r(\mathbf{x}). \end{align*}

The critical points of $F$ satisfy:

\begin{align*} \begin{cases} \nabla_{\mathbf{x}} F(\mathbf{x}, \lambda_1, \dots, \lambda_r) = \nabla f(\mathbf{x}) - \lambda_1 \nabla g_1(\mathbf{x}) - \dots - \lambda_r \nabla g_r(\mathbf{x}) = 0, \\ \dfrac{\partial F}{\partial \lambda_i}(\mathbf{x}, \lambda_1, \dots, \lambda_r) = -g_i(\mathbf{x}) = 0, \quad i = 1, \dots, r. \end{cases} \end{align*}

The first equation is the necessary condition for a constrained extremum point, while the remaining equations are precisely the constraints themselves. In this way, the constrained extremum problem is reduced to finding the critical points of the augmented function $F$, with no constraints. This method is called the Lagrange multiplier method, and the $\lambda_i$ are called Lagrange multipliers.
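For instance, to maximize $f(x, y) = x + y$ on the circle $g(x, y) = x^2 + y^2 - 1 = 0$, the system gives $1 = 2\lambda x$, $1 = 2\lambda y$ together with the constraint, hence $x = y = 1/(2\lambda)$ and $\lambda = \pm 1/\sqrt{2}$. A sketch verifying the positive branch (illustrative Python):

```python
import math

# Maximize f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0.
# Stationarity gives (1, 1) = lambda * (2x, 2y), so x = y = 1/(2*lambda);
# substituting into the constraint yields lambda = ±1/sqrt(2).
lam = 1 / math.sqrt(2)
x = y = 1 / (2 * lam)
assert abs(x**2 + y**2 - 1) < 1e-12          # constraint holds
assert abs(1 - 2 * lam * x) < 1e-12          # gradient condition holds
f_max = x + y                                # the constrained maximum value
assert abs(f_max - math.sqrt(2)) < 1e-12     # equals sqrt(2)
```

The branch $\lambda = -1/\sqrt{2}$ gives the symmetric point $(-1/\sqrt{2}, -1/\sqrt{2})$, the constrained minimum.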

Theorem 3.6. Let $f$ be a $\mathscr{C}^r$ function, and let $g: \mathbb{R}^m \to \mathbb{R}^n$ ($m > n$) be a $\mathscr{C}^r$ mapping. Suppose $\mathbf{x}^*$ satisfies $g(\mathbf{x}^*) = 0$ and the Jacobian matrix $Jg(\mathbf{x}^*)$ has full rank (i.e., $\operatorname{rank} \mathrm{d}g(\mathbf{x}^*) = n$). Then:

  1. (Fermat's Lemma, Necessary Condition for a Constrained Extremum): If $f$ attains a (constrained) extremum at $\mathbf{x}^*$ under the constraint $g(\mathbf{x}) = 0$, then the gradient $\nabla f(\mathbf{x}^*)$ is orthogonal to the tangent space of the surface $\Sigma: g(\mathbf{x}) = 0$ at $\mathbf{x}^*$. In this case, there exists a unique set of real numbers $\lambda_1^*, \dots, \lambda_n^*$ such that

\begin{align*} \nabla f(\mathbf{x}^*) = \lambda_1^* \nabla g_1(\mathbf{x}^*) + \dots + \lambda_n^* \nabla g_n(\mathbf{x}^*). \end{align*}
  2. (Sufficient Condition for a Constrained Extremum): If the Hessian matrix of $f(\mathbf{x}) - \lambda_1^* g_1(\mathbf{x}) - \dots - \lambda_n^* g_n(\mathbf{x})$ with respect to $\mathbf{x}$ is positive definite (resp. negative definite) when restricted to the tangent space $T_{\mathbf{x}^*}\Sigma$, then $\mathbf{x}^*$ is a constrained local minimum point (resp. a constrained local maximum point) of $f$ restricted to $\Sigma$.

  3. (Saddle Point): If the Hessian matrix of $f(\mathbf{x}) - \lambda_1^* g_1(\mathbf{x}) - \dots - \lambda_n^* g_n(\mathbf{x})$ with respect to $\mathbf{x}$, when restricted to the tangent space $T_{\mathbf{x}^*}\Sigma$, has both positive and negative eigenvalues, then $\mathbf{x}^*$ is not a constrained extremum point of $f$ restricted to $\Sigma$.
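As an illustration of the sufficient condition: for $f(x, y) = x + y$ with constraint $g(x, y) = x^2 + y^2 - 1 = 0$, the multiplier system yields $\lambda^* = 1/\sqrt{2}$ at $\mathbf{x}^* = (1/\sqrt{2}, 1/\sqrt{2})$, and $f - \lambda^* g$ has Hessian $\operatorname{diag}(-2\lambda^*, -2\lambda^*)$, which is negative definite on $T_{\mathbf{x}^*}\Sigma$, so $\mathbf{x}^*$ is a constrained local maximum. A numerical sketch (illustrative Python):

```python
import math

# Second-order check at the candidate point of f(x, y) = x + y on the circle:
# L(x, y) = (x + y) - lam * (x^2 + y^2 - 1) has Hessian diag(-2*lam, -2*lam).
lam = 1 / math.sqrt(2)
x0 = y0 = 1 / math.sqrt(2)
# The tangent space of the circle at (x0, y0) consists of vectors orthogonal
# to grad g = (2*x0, 2*y0); it is spanned by t = (-y0, x0).
t = (-y0, x0)
# Quadratic form of the Hessian evaluated on the tangent direction:
q = -2 * lam * (t[0]**2 + t[1]**2)
assert q < 0   # negative definite on the tangent space -> constrained local max
```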