3 Implicit Function and Inverse Mapping Theorem
3.1 Implicit Function Theorem
Theorem 3.1 (Implicit Function Theorem). Let F:W→Rn be a Ck mapping on an open set W⊆Rm×Rn. Suppose (x0,y0)∈W satisfies F(x0,y0)=0 and that the partial Jacobian ∂yF(x0,y0) (the n×n matrix of partial derivatives of F with respect to the y-variables) is invertible.
Then there exist a neighborhood U⊆Rm of x0, a neighborhood V⊆Rn of y0, and a Ck mapping f:U→V such that:
(1) For all x∈U,
F(x,f(x))=0,
and
(2) If (x,y)∈U×V satisfies F(x,y)=0, then
y=f(x).
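As a concrete illustration (a minimal numerical sketch with a hypothetical example, not part of the theorem): take F(x,y) = x^2 + y^2 − 1 near (x0,y0) = (0,1). Then F(0,1) = 0 and ∂yF(0,1) = 2 is invertible, so the theorem guarantees a local solution y = f(x); Newton's method in the y-variable recovers it and matches the explicit branch f(x) = √(1 − x^2).

```python
import numpy as np

# Hypothetical example: F(x, y) = x^2 + y^2 - 1, with F(0, 1) = 0 and
# dF/dy(0, 1) = 2 invertible, so y = f(x) exists near x0 = 0 (Theorem 3.1).
def F(x, y):
    return x**2 + y**2 - 1.0

def dF_dy(x, y):
    return 2.0 * y

def f(x, y_init=1.0, tol=1e-12, max_iter=50):
    """Solve F(x, y) = 0 for y by Newton's method, starting near y0 = 1."""
    y = y_init
    for _ in range(max_iter):
        step = F(x, y) / dF_dy(x, y)
        y -= step
        if abs(step) < tol:
            break
    return y

for x in (0.0, 0.1, 0.3):
    # The computed f(x) agrees with the explicit branch sqrt(1 - x^2).
    print(x, f(x), np.sqrt(1.0 - x**2))
```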
3.2 Inverse Mapping Theorem
Theorem 3.2 (Inverse Mapping Theorem). Let f:U→Rn be Ck differentiable on an open set U⊆Rn. If df(x0) is invertible at some x0∈U, then there exist a neighborhood V⊆U of x0 and a neighborhood W of f(x0) such that
f:V→W
has a Ck inverse mapping.
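A hedged numerical sketch (hypothetical example, not from the source): the polar-coordinate map f(r,t) = (r cos t, r sin t) has det df(r,t) = r, so df is invertible wherever r ≠ 0, and Theorem 3.2 provides a local Ck inverse there; globally f is not injective, since t and t + 2π give the same point. Newton's method computes the local inverse:

```python
import numpy as np

# Hypothetical example: the polar-coordinate map f(r, t) = (r cos t, r sin t).
# det df(r, t) = r, so Theorem 3.2 gives a local C^k inverse wherever r != 0.
def f(u):
    r, t = u
    return np.array([r * np.cos(t), r * np.sin(t)])

def df(u):
    r, t = u
    return np.array([[np.cos(t), -r * np.sin(t)],
                     [np.sin(t),  r * np.cos(t)]])

def local_inverse(y, u_init, tol=1e-12, max_iter=50):
    """Newton's method: find u near u_init with f(u) = y."""
    u = np.asarray(u_init, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(df(u), f(u) - y)
        u -= step
        if np.linalg.norm(step) < tol:
            break
    return u

u0 = np.array([2.0, 0.5])
y = f(u0) + np.array([0.05, -0.02])   # a point near f(u0)
print(local_inverse(y, u0))           # the unique preimage near u0
```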
3.3 Diffeomorphisms
Definition. A mapping f:U→V is called a Ck diffeomorphism if f is a Ck mapping and there exists a Ck inverse mapping g:V→U such that
g(f(x))=x, f(g(y))=y, ∀x∈U, y∈V.
Theorem 3.3. Let U⊆Rn be an open set, and let f:U→Rn be a Ck mapping. Then
V=f(U) is an open set and f:U→V is a Ck diffeomorphism
if and only if
f is injective and, for every x∈U, df(x) is an invertible linear mapping.
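The invertibility hypothesis in Theorems 3.2 and 3.3 cannot be dropped. A standard example (sketched numerically below, with the cube map chosen for illustration): f(x) = x^3 is an injective C∞ map of R onto R, but df(0) = 0 is not invertible, and indeed the inverse g(y) = y^(1/3) is continuous yet not differentiable at 0.

```python
# Hypothetical example: f(x) = x^3 is injective and C^infinity on R, but
# df(0) = 0 is not invertible, so Theorem 3.3 does not apply; the inverse
# g(y) = y^(1/3) is continuous but not differentiable at 0.
def g(y):
    return abs(y) ** (1.0 / 3.0) * (1.0 if y >= 0 else -1.0)

for h in (1e-3, 1e-6, 1e-9):
    # The difference quotient (g(h) - g(0)) / h = h^(-2/3) blows up as h -> 0.
    print(h, (g(h) - g(0.0)) / h)
```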
3.4 Surfaces, Tangent Planes and Normal Vectors
Surface
Let 1≤m≤n−1. We say that Σ⊂Rn is an m-dimensional Ck surface if, for every x0∈Σ, Σ is the graph of an m-variable Ck mapping in a neighborhood of x0. That is, there exist a neighborhood U of x0, a permutation σ(1),σ(2),…,σ(n) of 1,2,…,n, and a Ck mapping g (with values in Rn−m) such that for any x=(x1,x2,…,xn)∈U,
x∈Σ ⟺ (xσ(m+1),…,xσ(n)) = g(xσ(1),…,xσ(m)).
When m=n−1, Σ is called a Ck hypersurface.
When m=1, Σ is a Ck curve.
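For instance (a brief sketch with the unit sphere as a hypothetical example): Σ = {x^2 + y^2 + z^2 = 1} ⊂ R3 is a 2-dimensional C∞ surface; near the north pole (0,0,1) it is the graph z = g(x,y) = √(1 − x^2 − y^2), while near other points one solves for a different coordinate, which is exactly the role of the permutation σ in the definition.

```python
import numpy as np

# Near the north pole (0, 0, 1) the unit sphere is the graph of
# g(x, y) = sqrt(1 - x^2 - y^2); near other points one solves for a
# different coordinate (a different permutation sigma in the definition).
def g(x, y):
    return np.sqrt(1.0 - x**2 - y**2)

x, y = 0.1, -0.2
z = g(x, y)
print(x**2 + y**2 + z**2)  # = 1.0, so (x, y, g(x, y)) lies on the sphere
```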
Theorem 3.4.
- If a∈Rn is a regular value of a Ck mapping F:U→Rn (where U⊆Rm is a region, m>n), meaning that dF(x) has full row rank for every x∈U satisfying F(x)=a, then the level set of F
Σ={x∈U∣F(x)=a}
is an (m−n)-dimensional Ck surface in Rm.
- Let U⊆Rm be an open set, and let f:U→Rn (x=f(u1,…,um)) be a Ck injective mapping such that for every u∈U, rank(df(u))=m, i.e., df(u) has full column rank. Then
f(U)⊆Rn
is an m-dimensional Ck surface in Rn, and f is called a Ck parametric representation of this surface (both rank criteria are illustrated in the sketch below).
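Both criteria of Theorem 3.4 can be checked numerically on the unit sphere (a sketch with hypothetical choices of F and f, not prescribed by the source):

```python
import numpy as np

# Level-set criterion: F(x) = x1^2 + x2^2 + x3^2 with regular value a = 1.
# dF(x) = 2 x^T has full row rank (rank 1) at every x != 0, so the sphere
# {F = 1} is a (3 - 1) = 2-dimensional surface in R^3.
def dF(x):
    return 2.0 * np.asarray(x, dtype=float).reshape(1, 3)

# Parametric criterion: the spherical patch
# f(u, v) = (cos u cos v, cos u sin v, sin u); its differential has
# full column rank (rank 2) for |u| < pi/2.
def df(u, v):
    return np.array([[-np.sin(u) * np.cos(v), -np.cos(u) * np.sin(v)],
                     [-np.sin(u) * np.sin(v),  np.cos(u) * np.cos(v)],
                     [ np.cos(u),              0.0]])

print(np.linalg.matrix_rank(dF([0.0, 0.6, 0.8])))  # 1 = full row rank
print(np.linalg.matrix_rank(df(0.3, 1.0)))         # 2 = full column rank
```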
Tangent Space
Let Σ be a Ck surface in Rn, and x0∈Σ. Let γ:(a,b)→Rn be a C1 curve on Σ passing through x0, meaning that γ(t)∈Σ for all t∈(a,b) and γ(t0)=x0 for some t0∈(a,b). Then γ′(t0)∈Rn is called a tangent vector of Σ at x0. Let Tx0Σ denote the set of all tangent vectors of Σ at x0; it is called the tangent space of Σ at x0.
The tangent plane of the surface Σ at x0∈Σ is
x0+Tx0Σ.
The normal space of the surface Σ at x0∈Σ is
(Tx0Σ)⊥,
and the elements of this set are called the normal vectors of Σ at x0.
The normal line / normal plane of the surface Σ at x0∈Σ is
x0+(Tx0Σ)⊥.
Theorem 3.5. Let Σ⊆Rn be an m-dimensional Ck surface. Then for any x0∈Σ, the tangent space Tx0Σ is an m-dimensional linear subspace of Rn, and
- If Σ is a regular level set of a Ck mapping F, Σ={x∈Rn∣F(x)=a}, then
Tx0Σ = Ker dF(x0) = {v∈Rn ∣ dF(x0)(v)=0}.
Equation of the tangent plane:
dF(x0)(x−x0)=0
The gradient vector ∇F(x0) is a normal vector of the surface Σ at x0∈Σ.
- If f:U→Rn is a Ck parametric representation of Σ, then for x0=f(u0),
Tx0Σ = Range df(u0) = {df(u0)(v)∈Rn ∣ v∈Rm}.
Equation of the tangent plane:
x=x0+df(u0)(v),v∈Rm
The normal space of the surface Σ at x0∈Σ is Ker((df(u0))T). Both descriptions are verified numerically in the sketch below.
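A numerical check on the unit sphere (same hypothetical patch as above): ∇F(x0) = 2x0 spans the normal space, the columns of df(u0) span Tx0Σ, and the two are orthogonal, as Theorem 3.5 predicts:

```python
import numpy as np

# Unit sphere as level set F(x) = |x|^2 = 1 and as the patch
# f(u, v) = (cos u cos v, cos u sin v, sin u).
def f(u, v):
    return np.array([np.cos(u) * np.cos(v), np.cos(u) * np.sin(v), np.sin(u)])

def df(u, v):
    return np.array([[-np.sin(u) * np.cos(v), -np.cos(u) * np.sin(v)],
                     [-np.sin(u) * np.sin(v),  np.cos(u) * np.cos(v)],
                     [ np.cos(u),              0.0]])

u0, v0 = 0.3, 1.0
x0 = f(u0, v0)
normal = 2.0 * x0              # grad F(x0) spans the normal space (Thm 3.5)
tangent_basis = df(u0, v0)     # columns span T_{x0} Sigma = Range df(u0)
print(normal @ tangent_basis)  # ~ [0, 0]: tangent vectors lie in Ker dF(x0)
```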
3.5 Application: Constrained Extremum and Lagrange Multiplier Method
To extremize a function f under the constraints g1=⋯=gr=0, one could in principle solve the constraints and substitute; since the constraint g=0 is usually not easy to solve explicitly, Lagrange proposed considering the following augmented function:
F(x,λ1,…,λr)=f(x)−λ1g1(x)−⋯−λrgr(x).
The critical points of F satisfy:
∇xF(x,λ1,…,λr) = ∇f(x) − λ1∇g1(x) − ⋯ − λr∇gr(x) = 0,
∂F/∂λi(x,λ1,…,λr) = −gi(x) = 0,  i = 1,…,r.
The first equation is the necessary condition for a constrained extremum point, while the remaining equations are precisely the constraints themselves. In this way, the constrained extremum problem is reduced to an unconstrained critical-point problem for the augmented function F. This method is called the "Lagrange Multiplier Method," and the λi are called Lagrange multipliers.
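A worked illustration (hypothetical example, not from the source): extremize f(x,y) = xy on the circle g(x,y) = x^2 + y^2 − 1 = 0. The critical-point system reads y − 2λx = 0, x − 2λy = 0, x^2 + y^2 = 1, whose solutions are λ = ±1/2 with x = ±1/√2 and y = ±x. Newton's method on the full system locates one of these points:

```python
import numpy as np

# Hypothetical example: extremize f(x, y) = x*y on g(x, y) = x^2 + y^2 - 1 = 0.
# Critical points of the augmented function solve:
#   y - 2*lam*x = 0,   x - 2*lam*y = 0,   x^2 + y^2 - 1 = 0.
def system(z):
    x, y, lam = z
    return np.array([y - 2*lam*x, x - 2*lam*y, x**2 + y**2 - 1.0])

def jacobian(z):
    x, y, lam = z
    return np.array([[-2*lam,    1.0, -2*x],
                     [   1.0, -2*lam, -2*y],
                     [ 2*x,      2*y,  0.0]])

z = np.array([0.8, 0.7, 0.4])   # initial guess near one solution
for _ in range(20):             # Newton's method on the full system
    z -= np.linalg.solve(jacobian(z), system(z))
print(z)  # ~ (0.7071, 0.7071, 0.5): x* = (1/sqrt(2), 1/sqrt(2)), lam* = 1/2
```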
Theorem 3.6. Let f:Rm→R be a Cr function, and let g=(g1,…,gn):Rm→Rn (m>n) be a Cr mapping. Suppose x∗ satisfies g(x∗)=0 and the Jacobian matrix Jg(x∗) has full rank (i.e., rank dg(x∗)=n). Then:
- (Fermat's Lemma, Necessary Condition for Constrained Extremum): If f attains a (constrained) extremum at x∗ under the constraint g(x)=0, then the gradient ∇f(x∗) is orthogonal to the tangent space of the surface Σ: g(x)=0 at x∗. Equivalently, there exists a unique set of real numbers λ1∗,…,λn∗ such that:
∇f(x∗)=λ1∗∇g1(x∗)+⋯+λn∗∇gn(x∗)
- (Sufficient Condition for Constrained Extremum): If the Hessian matrix of f(x)−λ1∗g1(x)−⋯−λn∗gn(x) with respect to x is positive definite (respectively, negative definite) when restricted to the tangent space Tx∗Σ, then x∗ is a constrained local minimum (respectively, constrained local maximum) point of f restricted to Σ (see the numerical sketch after this theorem).
- (Saddle Point): If the Hessian matrix of f(x)−λ1∗g1(x)−⋯−λn∗gn(x) with respect to x, when restricted to the tangent space Tx∗Σ, has both positive and negative eigenvalues, then x∗ is not a constrained extremum point of f restricted to Σ.
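Continuing the hypothetical example above: at x∗ = (1/√2, 1/√2) with λ∗ = 1/2, the Hessian of f − λ∗g is [[0,1],[1,0]] − I, and the tangent space of the circle at x∗ is spanned by t = (1,−1)/√2. The restricted quadratic form tᵀHt = −2 < 0 is negative definite, so x∗ is a constrained local maximum, consistent with the sufficient condition (a sketch, assuming the setup above):

```python
import numpy as np

# Sufficient condition for the hypothetical example f = x*y, g = x^2 + y^2 - 1:
# Hessian of f(x, y) - lam* g(x, y) at x* = (1/sqrt(2), 1/sqrt(2)), lam* = 1/2.
H = np.array([[0.0, 1.0], [1.0, 0.0]]) - 0.5 * np.array([[2.0, 0.0], [0.0, 2.0]])

# T_{x*}Sigma = Ker dg(x*) is spanned by t = (1, -1)/sqrt(2), since
# dg(x*) = 2 x* points in the normal direction.
t = np.array([1.0, -1.0]) / np.sqrt(2.0)

print(t @ H @ t)  # = -2 < 0: negative definite on the tangent space, so x*
                  # is a constrained local maximum, as Theorem 3.6 asserts
```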