
6 Inner Product Spaces

6.1 Inner Products and Norms

Inner Product

Let $V$ be a vector space over $F$. An inner product on $V$ is a function that assigns, to every ordered pair of vectors $x$ and $y$ in $V$, a scalar in $F$, denoted $\langle x, y\rangle$, such that for all $x, y, z \in V$ and all $c \in F$, the following hold:

(a) $\langle x + z, y\rangle = \langle x, y\rangle + \langle z, y\rangle$.

(b) $\langle cx, y\rangle = c\langle x, y\rangle$.

(c) $\overline{\langle x, y\rangle} = \langle y, x\rangle$, where the bar denotes complex conjugation.

(d) $\langle x, x\rangle > 0$ if $x \ne 0$.

  • We assume that all vector spaces are over the field $F$, where $F$ denotes either $\mathbb{R}$ or $\mathbb{C}$.
  • Note that (c) reduces to $\langle x, y\rangle = \langle y, x\rangle$ if $F = \mathbb{R}$.

Theorem 6.1. Let $V$ be an inner product space. Then for $x, y, z \in V$ and $c \in F$, the following statements are true.

(a) $\langle x, y + z\rangle = \langle x, y\rangle + \langle x, z\rangle$.

(b) $\langle x, cy\rangle = \overline{c}\,\langle x, y\rangle$.

(c) $\langle x, 0\rangle = \langle 0, x\rangle = 0$.

(d) $\langle x, x\rangle = 0$ if and only if $x = 0$.

(e) If $\langle x, y\rangle = \langle x, z\rangle$ for all $x \in V$, then $y = z$.

Definition. For $x = (a_1, a_2, \ldots, a_n)$ and $y = (b_1, b_2, \ldots, b_n)$ in $F^n$, define

\begin{equation*} \langle x, y\rangle = \sum_{i=1}^{n} a_i \overline{b_i}. \end{equation*}

This inner product is called the standard inner product on $F^n$.

  • When $F = \mathbb{R}$ the conjugations are not needed, and in early courses this standard inner product is usually called the dot product and is denoted by $x \cdot y$ instead of $\langle x, y\rangle$.
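A quick numerical illustration (a minimal NumPy sketch, not part of the text; the helper name `standard_inner` is illustrative). Note that NumPy's `vdot` conjugates its first argument, so its arguments must be swapped to match the convention above.

```python
import numpy as np

def standard_inner(x, y):
    """Standard inner product on F^n: sum_i a_i * conj(b_i)."""
    return np.sum(np.asarray(x) * np.conj(np.asarray(y)))

x = np.array([1 + 1j, 2])
y = np.array([3, 1 - 1j])
print(standard_inner(x, y))   # (1+1j)*conj(3) + 2*conj(1-1j) = 5+5j
print(np.vdot(y, x))          # np.vdot conjugates its first argument, so swap
```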

Norm

Let $V$ be an inner product space. For $x \in V$, we define the norm or length of $x$ by

\begin{equation*} \|x\| = \sqrt{\langle x, x\rangle}. \end{equation*}

Theorem 6.2. Let $V$ be an inner product space over $F$. Then for all $x, y \in V$ and $c \in F$, the following statements are true.

(a) $\|cx\| = |c| \cdot \|x\|$.

(b) $\|x\| = 0$ if and only if $x = 0$. In any case, $\|x\| \ge 0$.

(c) (Cauchy–Schwarz Inequality)

\begin{equation*} |\langle x, y\rangle| \le \|x\| \cdot \|y\|. \end{equation*}

(d) (Triangle Inequality)

\begin{equation*} \|x + y\| \le \|x\| + \|y\|. \end{equation*}
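Both inequalities are easy to spot-check numerically; the following minimal NumPy sketch (with the norm computed from the standard inner product, and randomly generated complex vectors) prints `True` twice.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5) + 1j * rng.standard_normal(5)
y = rng.standard_normal(5) + 1j * rng.standard_normal(5)

inner = np.vdot(y, x)                    # <x, y> = sum_i x_i conj(y_i)
norm = lambda v: np.sqrt(np.vdot(v, v).real)

print(abs(inner) <= norm(x) * norm(y))   # Cauchy–Schwarz inequality
print(norm(x + y) <= norm(x) + norm(y))  # triangle inequality
```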

Definition. Let $A \in M_{m \times n}(F)$. We define the conjugate transpose or adjoint of $A$ to be the $n \times m$ matrix $A^{*}$ such that

\begin{equation*} (A^{*})_{ij} = \overline{A_{ji}} \quad \text{for all } i, j. \end{equation*}

Definition. Let $V = M_{n \times n}(F)$, and define

\begin{equation*} \langle A, B\rangle = \operatorname{tr}(B^{*} A) \end{equation*}

for $A, B \in V$.

The inner product on $M_{n \times n}(F)$ in this example is called the Frobenius inner product.
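As an illustration (a NumPy sketch; the helper name `frobenius_inner` is mine, not the text's), $\operatorname{tr}(B^{*}A)$ agrees with the entrywise sum $\sum_{i,j} A_{ij}\overline{B_{ij}}$:

```python
import numpy as np

def frobenius_inner(A, B):
    """Frobenius inner product <A, B> = tr(B* A) on M_{n x n}(F)."""
    return np.trace(B.conj().T @ A)

A = np.array([[1, 2j], [0, 1]])
B = np.array([[1j, 1], [3, 2]])
print(frobenius_inner(A, B))      # tr(B* A) = 2+1j
print(np.sum(A * B.conj()))       # entrywise form gives the same value
```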

Definition. A vector space $V$ over $F$ endowed with a specific inner product is called an inner product space. If $F = \mathbb{C}$, we call $V$ a complex inner product space, whereas if $F = \mathbb{R}$, we call $V$ a real inner product space.

Orthogonal and Orthonormal

Let $V$ be an inner product space. Vectors $x$ and $y$ in $V$ are orthogonal (perpendicular) if $\langle x, y\rangle = 0$.

A subset $S$ of $V$ is orthogonal if any two distinct vectors in $S$ are orthogonal.

  • A vector $x$ in $V$ is a unit vector if $\|x\| = 1$.

Finally, a subset $S$ of $V$ is orthonormal if $S$ is orthogonal and consists entirely of unit vectors.

  • Note that if $S = \{v_1, v_2, \ldots, v_r\}$ is orthonormal, then $\langle v_i, v_j\rangle = \delta_{ij}$, where $\delta_{ij}$ denotes the Kronecker delta.

Definition. If $x$ is any nonzero vector, then

\begin{equation*} \frac{1}{\|x\|} x \end{equation*}

is a unit vector. The process of multiplying a nonzero vector by the reciprocal of its length is called normalizing.

6.2 The Gram–Schmidt Orthogonalization Process and Orthogonal Complements

Definition. Let $V$ be an inner product space. A subset of $V$ is an orthonormal basis for $V$ if it is an ordered basis that is orthonormal.

Theorem 6.3. Let $V$ be an inner product space and $S = \{v_1, v_2, \ldots, v_k\}$ be an orthogonal subset of $V$ consisting of nonzero vectors. If $y \in \operatorname{span}(S)$, then

\begin{equation*} y = \sum_{i=1}^{k} \frac{\langle y, v_i\rangle}{\|v_i\|^2}\, v_i. \end{equation*}

Corollary 1. Let $V$ be an inner product space, and let $S$ be an orthogonal subset of $V$ consisting of nonzero vectors. Then $S$ is linearly independent.

Gram–Schmidt Process

Let $V$ be an inner product space and $S = \{w_1, w_2, \ldots, w_n\}$ be a linearly independent subset of $V$. Define $S' = \{v_1, v_2, \ldots, v_n\}$, where $v_1 = w_1$ and

\begin{equation*} v_k = w_k - \sum_{j=1}^{k-1} \frac{\langle w_k, v_j\rangle}{\|v_j\|^2} \, v_j \quad \text{for } 2 \le k \le n. \end{equation*}

Then $S'$ is an orthogonal set of nonzero vectors such that $\operatorname{span}(S') = \operatorname{span}(S)$.
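The recursion translates directly into code. Below is a minimal NumPy sketch (the function name `gram_schmidt` and the sample vectors are illustrative) that produces the orthogonal set $S'$ from rows assumed to be linearly independent; normalizing each $v_k$ afterwards yields an orthonormal set.

```python
import numpy as np

def gram_schmidt(W):
    """Classical Gram–Schmidt on the rows of W (assumed linearly independent)."""
    V = []
    for w in W:
        v = np.array(w, dtype=complex)
        for u in V:
            v = v - (np.vdot(u, w) / np.vdot(u, u)) * u   # subtract <w,u>/||u||^2 * u
        V.append(v)
    return np.array(V)

W = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
V = gram_schmidt(W)
print(np.round(V @ V.conj().T, 10).real)   # Gram matrix is diagonal: the v_k are orthogonal
```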

Orthogonal Complement

Let $S$ be a nonempty subset of an inner product space $V$. We define $S^{\perp}$ (read "$S$ perp") to be the set of all vectors in $V$ that are orthogonal to every vector in $S$; that is,

\begin{equation*} S^{\perp} = \{ x \in V : \langle x, y\rangle = 0 \text{ for all } y \in S \}. \end{equation*}

The set $S^{\perp}$ is called the orthogonal complement of $S$.

  • It is easily seen that $S^{\perp}$ is a subspace of $V$ for any subset $S$ of $V$.

Theorem 6.4. Let $V$ be a nonzero finite-dimensional inner product space. Then $V$ has an orthonormal basis $\beta$. Furthermore, if $\beta = \{v_1, v_2, \ldots, v_n\}$ and $x \in V$, then

\begin{equation*} x = \sum_{i=1}^{n} \langle x, v_i\rangle \, v_i. \end{equation*}

Corollary 1. Let $V$ be a finite-dimensional inner product space with an orthonormal basis $\beta = \{v_1, v_2, \ldots, v_n\}$. Let $T$ be a linear operator on $V$, and let $A = [T]_{\beta}$. Then for any $i$ and $j$,

\begin{equation*} A_{ij} = \langle T(v_j), v_i \rangle . \end{equation*}

Theorem 6.5. Let $W$ be a finite-dimensional subspace of an inner product space $V$, and let $y \in V$. Then there exist unique vectors $u \in W$ and $z \in W^{\perp}$ such that $y = u + z$. Furthermore, if $\{v_1, v_2, \ldots, v_k\}$ is an orthonormal basis for $W$, then

\begin{equation*} u = \sum_{i=1}^{k} \langle y, v_i\rangle \, v_i. \end{equation*}

Corollary 1. The vector $u$ is the unique vector in $W$ that is "closest" to $y$; that is, for any $x \in W$,

\begin{equation*} \|y - x\| \ge \|y - u\|, \end{equation*}

and this inequality is an equality if and only if $x = u$.

  • The vector $u$ in the corollary is called the orthogonal projection of $y$ on $W$.
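The formula for $u$ in Theorem 6.5 can be evaluated directly once an orthonormal basis for $W$ is in hand; a minimal NumPy sketch (the helper `project_onto` and the example data are mine) follows.

```python
import numpy as np

def project_onto(y, basis):
    """Orthogonal projection u = sum_i <y, v_i> v_i onto the span of an orthonormal basis."""
    return sum(np.vdot(v, y) * v for v in basis)

# W = the xy-plane in R^3, with orthonormal basis {e1, e2}
basis = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
y = np.array([3.0, -2.0, 7.0])
u = project_onto(y, basis)
print(u)        # [ 3. -2.  0.]
print(y - u)    # the component in W-perp
```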

Theorem 6.6. Suppose that $S = \{v_1, v_2, \ldots, v_k\}$ is an orthonormal set in an $n$-dimensional inner product space $V$. Then

(a) $S$ can be extended to an orthonormal basis $\{v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ for $V$.

(b) If $W = \operatorname{span}(S)$, then $S_1 = \{v_{k+1}, v_{k+2}, \ldots, v_n\}$ is an orthonormal basis for $W^{\perp}$.

(c) If $W$ is any subspace of $V$, then

\begin{equation*} \dim(V) = \dim(W) + \dim(W^{\perp}). \end{equation*}

6.3 The Adjoint of a Linear Operator

Theorem 6.7. Let $V$ be a finite-dimensional inner product space over $\mathbb{F}$, and let $g : V \to \mathbb{F}$ be a linear transformation. Then there exists a unique vector $y \in V$ such that

\begin{equation*} g(x) = \langle x, y \rangle \quad \text{for all } x \in V. \end{equation*}

Adjoint

Let $V$ be a finite-dimensional inner product space, and let $T$ be a linear operator on $V$. Then there exists a unique function $T^* : V \to V$ such that

\begin{equation*} \langle T(x), y \rangle = \langle x, T^*(y) \rangle \quad \text{for all } x, y \in V. \end{equation*}

Furthermore, $T^*$ is linear. The linear operator $T^*$ is called the adjoint of the operator $T$.

Theorem 6.8. Let $V$ be a finite-dimensional inner product space, and let $\beta$ be an orthonormal basis for $V$. If $T$ is a linear operator on $V$, then

\begin{equation*} [T^*]_{\beta} = [T]_{\beta}^*. \end{equation*}
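For the matrix operator $L_A$ on $\mathbb{C}^n$ with the standard inner product, the adjoint is represented by the conjugate transpose; the NumPy sketch below (random illustrative data) checks the defining identity $\langle Ax, y\rangle = \langle x, A^*y\rangle$ numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

A_star = A.conj().T              # matrix of the adjoint in the standard basis
lhs = np.vdot(y, A @ x)          # <Ax, y>
rhs = np.vdot(A_star @ y, x)     # <x, A*y>
print(np.isclose(lhs, rhs))      # True
```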

Theorem 6.9. Let $V$ be an inner product space, and let $T$ and $U$ be linear operators on $V$. Then:

(a) $(T + U)^* = T^* + U^*$;

(b) $(cT)^* = \overline{c}\, T^*$ for any $c \in \mathbb{F}$;

(c) $(TU)^* = U^* T^*$;

(d) $T^{**} = T$;

(e) $I^* = I$.

Corollary 1. Let $A$ and $B$ be $n \times n$ matrices. Then:

(a) $(A + B)^* = A^* + B^*$;

(b) $(cA)^* = \overline{c}\, A^*$ for all $c \in \mathbb{F}$;

(c) $(AB)^* = B^* A^*$;

(d) $A^{**} = A$;

(e) $I^* = I$.

Least Squares Problem

Given $A \in M_{m \times n}(F)$ and $y \in F^m$, the least squares problem reduces to finding a vector $x_0 \in F^n$ such that

\begin{equation*} \|y - A x_0\| \le \|y - A x\| \quad \text{for all } x \in F^n. \end{equation*}

That is, $Ax_0$ is the orthogonal projection of $y$ onto the column space of $A$.

Lemma 1. Let $A \in M_{m \times n}(F)$, $x \in F^n$, and $y \in F^m$. Then

\begin{equation*} \langle Ax, y \rangle_m = \langle x, A^* y \rangle_n. \end{equation*}

Lemma 2. Let $A \in M_{m \times n}(F)$. Then

\begin{equation*} \operatorname{rank}(A^*A) = \operatorname{rank}(A). \end{equation*}

Theorem. Let $A \in M_{m \times n}(F)$ and $y \in F^m$. Then there exists $x_0 \in F^n$ such that

\begin{equation*} (A^*A)x_0 = A^*y \end{equation*}

and

\begin{equation*} \|Ax_0 - y\| \le \|Ax - y\| \quad \text{for all } x \in F^n. \end{equation*}

Furthermore, if $\operatorname{rank}(A) = n$, then

\begin{equation*} x_0 = (A^*A)^{-1}A^*y. \end{equation*}
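A minimal NumPy sketch of the normal equations for an overdetermined fitting problem (the data are illustrative, not from the text); since $A$ is real here, $A^* = A^t$.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([2.0, 2.5, 4.1, 4.9])

# Normal equations (A*A) x0 = A* y; rank(A) = 2, so the solution is unique.
x0 = np.linalg.solve(A.T @ A, A.T @ y)
print(x0)

# Cross-check against NumPy's built-in least squares solver.
print(np.linalg.lstsq(A, y, rcond=None)[0])
```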

Minimal Solutions to Systems of Linear Equations

Even when a system of linear equations $Ax = b$ is consistent, there may be no unique solution. In such cases, it may be desirable to find a solution of minimal norm.

A solution $s$ to $Ax = b$ is called a minimal solution if

\begin{equation*} \|s\| \le \|u\| \end{equation*}

for all other solutions $u$.

Theorem. Let $A \in M_{m \times n}(F)$ and $b \in F^m$. Suppose that $Ax = b$ is consistent. Then the following statements are true.

(a) There exists exactly one minimal solution $s$ of $Ax = b$, and

\begin{equation*} s \in R(L_{A^*}). \end{equation*}

(b) The vector $s$ is the only solution to $Ax = b$ that lies in $R(L_{A^*})$; that is, if $u$ satisfies

\begin{equation*} (AA^*)u = b, \end{equation*}

then

\begin{equation*} s = A^*u. \end{equation*}
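The theorem gives a recipe for the minimal solution: solve $(AA^*)u = b$ and set $s = A^*u$. A NumPy sketch with an illustrative underdetermined but consistent system:

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])

u = np.linalg.solve(A @ A.T, b)   # (A A*) u = b  (A real, so A* = A^t)
s = A.T @ u                       # minimal solution s = A* u
print(s, np.allclose(A @ s, b))

# Any other solution, e.g. s + n with n in the null space of A, has larger norm.
n = np.array([1.0, -1.0, 1.0])    # A n = 0
print(np.linalg.norm(s), np.linalg.norm(s + n))
```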

6.4 Normal and Self-Adjoint Operators

Lemma. Let $T$ be a linear operator on a finite-dimensional inner product space $V$. If $T$ has an eigenvector, then so does $T^*$.

Theorem 6.10 (Schur). Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Suppose that the characteristic polynomial of $T$ splits. Then there exists an orthonormal basis $\beta$ for $V$ such that the matrix $[T]_\beta$ is upper triangular.

Normal

Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. We say that $T$ is normal if $TT^* = T^*T$. An $n \times n$ real or complex matrix $A$ is normal if $AA^* = A^*A$.

Theorem 6.11. Let $V$ be an inner product space, and let $T$ be a normal operator on $V$. Then the following statements are true.

(a) $\|T(x)\| = \|T^*(x)\|$ for all $x \in V$.

(b) $T - cI$ is normal for every $c \in F$.

(c) If $x$ is an eigenvector of $T$, then $x$ is also an eigenvector of $T^*$. In fact, if $T(x) = \lambda x$, then $T^*(x) = \overline{\lambda} x$.

(d) If $\lambda_1$ and $\lambda_2$ are distinct eigenvalues of $T$ with corresponding eigenvectors $x_1$ and $x_2$, then $x_1$ and $x_2$ are orthogonal.

Theorem 6.12. Let $T$ be a linear operator on a finite-dimensional complex inner product space $V$. Then $T$ is normal if and only if there exists an orthonormal basis for $V$ consisting of eigenvectors of $T$.

Self-Adjoint

Let $T$ be a linear operator on an inner product space $V$. We say that $T$ is self-adjoint (Hermitian) if $T = T^*$. An $n \times n$ real or complex matrix $A$ is self-adjoint (Hermitian) if $A = A^*$.

Theorem 6.13. Let $T$ be a self-adjoint operator on a finite-dimensional inner product space $V$. Then

(a) Every eigenvalue of $T$ is real.

(b) Suppose that $V$ is a real inner product space. Then the characteristic polynomial of $T$ splits.

Theorem 6.14. Let $T$ be a linear operator on a finite-dimensional real inner product space $V$. Then $T$ is self-adjoint if and only if there exists an orthonormal basis $\beta$ for $V$ consisting of eigenvectors of $T$.
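For a real symmetric matrix these facts are exactly what `numpy.linalg.eigh` delivers; the sketch below (illustrative matrix) exhibits real eigenvalues and an orthonormal eigenbasis that diagonalizes $A$.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals, Q = np.linalg.eigh(A)                    # columns of Q: orthonormal eigenvectors
print(eigvals)                                    # all real
print(np.allclose(Q.T @ Q, np.eye(3)))            # orthonormal basis
print(np.allclose(Q.T @ A @ Q, np.diag(eigvals))) # diagonalized by the eigenbasis
```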

6.5 Unitary and Orthogonal Operators and Their Matrices

Unitary and Orthogonal Operator

Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $F$. If $\|T(x)\| = \|x\|$ for all $x \in V$, we call $T$ a unitary operator if $F = \mathbb{C}$ and an orthogonal operator if $F = \mathbb{R}$.

Lemma. Let $U$ be a self-adjoint operator on a finite-dimensional inner product space $V$. If $\langle x, U(x) \rangle = 0$ for all $x \in V$, then $U = T_0$, the zero operator.

Theorem 6.15. Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Then the following statements are equivalent.

(a) $TT^* = T^*T = I$.

(b) $\langle T(x), T(y) \rangle = \langle x, y \rangle$ for all $x, y \in V$.

(c) If $\beta$ is an orthonormal basis for $V$, then $T(\beta)$ is an orthonormal basis for $V$.

(d) There exists an orthonormal basis $\beta$ for $V$ such that $T(\beta)$ is an orthonormal basis for $V$.

(e) $\|T(x)\| = \|x\|$ for all $x \in V$.

Theorem 6.16. Let $T$ be a linear operator on a finite-dimensional real inner product space $V$. Then $V$ has an orthonormal basis of eigenvectors of $T$ with corresponding eigenvalues of absolute value $1$ if and only if $T$ is both self-adjoint and orthogonal.

Theorem 6.17. Let $T$ be a linear operator on a finite-dimensional complex inner product space $V$. Then $V$ has an orthonormal basis of eigenvectors of $T$ with corresponding eigenvalues of absolute value $1$ if and only if $T$ is unitary.

Definition. Let $L$ be a one-dimensional subspace of $\mathbb{R}^2$. We may view $L$ as a line in the plane through the origin. A linear operator $T$ on $\mathbb{R}^2$ is called a reflection of $\mathbb{R}^2$ about $L$ if

\begin{equation*} T(x)=x \quad \text{for all } x\in L \end{equation*}

and

\begin{equation*} T(x)=-x \quad \text{for all } x\in L^\perp. \end{equation*}

Definition. A square matrix $A$ is called an orthogonal matrix if

\begin{equation*} A^t A = A A^t = I, \end{equation*}

and unitary if

\begin{equation*} A^* A = A A^* = I. \end{equation*}

Definition. $A$ and $B$ are unitarily equivalent [orthogonally equivalent] if and only if there exists a unitary [orthogonal] matrix $P$ such that $A = P^*BP$.

Theorem 6.18. Let $A$ be a complex $n \times n$ matrix. Then $A$ is normal if and only if $A$ is unitarily equivalent to a diagonal matrix.

Theorem 6.19. Let $A$ be a real $n \times n$ matrix. Then $A$ is symmetric (self-adjoint) if and only if $A$ is orthogonally equivalent to a real diagonal matrix.

6.6 Orthogonal Projections and the Spectral Theorem

Orthogonal Projection

If $V = W_1 \oplus W_2$, then a linear operator $T$ on $V$ is the projection on $W_1$ along $W_2$ if, whenever

\begin{equation*} x = x_1 + x_2, \end{equation*}

with $x_1 \in W_1$ and $x_2 \in W_2$, we have

\begin{equation*} T(x) = x_1. \end{equation*}

Let $V$ be an inner product space, and let $T: V \to V$ be a projection. We say that $T$ is an orthogonal projection if

\begin{equation*} R(T)^\perp = N(T) \end{equation*}

and

\begin{equation*} N(T)^\perp = R(T). \end{equation*}

Theorem 6.20. Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. Then $T$ is an orthogonal projection if and only if $T$ has an adjoint $T^*$ and

\begin{equation*} T^2 = T = T^*. \end{equation*}
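In matrix terms, if the columns of a matrix $B$ form an orthonormal basis for $W$, then $P = BB^*$ represents the orthogonal projection on $W$, and the conditions $P^2 = P = P^*$ of Theorem 6.20 can be checked directly (a NumPy sketch with a randomly generated $B$):

```python
import numpy as np

# B: 4x2 matrix with orthonormal columns, obtained from a QR factorization.
B = np.linalg.qr(np.random.default_rng(2).standard_normal((4, 2)))[0]
P = B @ B.conj().T

print(np.allclose(P @ P, P))        # idempotent: P^2 = P
print(np.allclose(P, P.conj().T))   # self-adjoint: P = P*
```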

Theorem 6.21 (The Spectral Theorem). Suppose that $T$ is a linear operator on a finite-dimensional inner product space $V$ over $\mathbb{F}$ with the distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_k$. Assume that $T$ is normal if $\mathbb{F} = \mathbb{C}$ and that $T$ is self-adjoint if $\mathbb{F} = \mathbb{R}$. For each $i$ $(1 \le i \le k)$, let $W_i$ be the eigenspace of $T$ corresponding to the eigenvalue $\lambda_i$, and let $T_i$ be the orthogonal projection of $V$ on $W_i$. Then the following statements are true.

(a)

\begin{equation*} V = W_1 \oplus W_2 \oplus \cdots \oplus W_k. \end{equation*}

(b) If $W_i'$ denotes the direct sum of the subspaces $W_j$ for $j \ne i$, then

\begin{equation*} W_i^\perp = W_i'. \end{equation*}

(c)

\begin{equation*} T_i T_j = \delta_{ij} T_i \quad \text{for } 1 \le i, j \le k. \end{equation*}

(d)

\begin{equation*} I = T_1 + T_2 + \cdots + T_k. \end{equation*}

(e)

\begin{equation*} T = \lambda_1 T_1 + \lambda_2 T_2 + \cdots + \lambda_k T_k. \end{equation*}

Corollary 1. If $\mathbb{F} = \mathbb{C}$, then $T$ is normal if and only if

\begin{equation*} T^* = g(T) \end{equation*}

for some polynomial $g$.

Corollary 2. If $\mathbb{F} = \mathbb{C}$, then $T$ is unitary if and only if $T$ is normal and

\begin{equation*} |\lambda| = 1 \end{equation*}

for every eigenvalue $\lambda$ of $T$.

Corollary 3. If $\mathbb{F} = \mathbb{C}$ and $T$ is normal, then $T$ is self-adjoint if and only if every eigenvalue of $T$ is real.

Corollary 4. Let $T$ be as in the spectral theorem with spectral decomposition

\begin{equation*} T = \lambda_1 T_1 + \lambda_2 T_2 + \cdots + \lambda_k T_k. \end{equation*}

Then each $T_j$ is a polynomial in $T$.
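A NumPy sketch of the spectral decomposition for a small self-adjoint matrix (illustrative data): the projections are built from an orthonormal eigenbasis, and statements (d) and (e) of the spectral theorem are verified numerically.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, Q = np.linalg.eigh(A)       # orthonormal eigenbasis of the self-adjoint A

# Orthogonal projection onto each eigenspace: P_lambda = sum of v v* over that eigenspace.
projections = {}
for lam, v in zip(eigvals, Q.T):
    v = v.reshape(-1, 1)
    projections[lam] = projections.get(lam, 0) + v @ v.conj().T

print(np.allclose(sum(lam * P for lam, P in projections.items()), A))  # A = sum lambda_i P_i
print(np.allclose(sum(projections.values()), np.eye(2)))               # I = sum P_i
```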

6.7 The Singular Value Decomposition

Singular Value Decomposition

Theorem 6.22 (Singular Value Theorem for Linear Transformations). Let $V$ and $W$ be finite-dimensional inner product spaces, and let $T : V \to W$ be a linear transformation of rank $r$. Then there exist orthonormal bases $\{v_1, v_2, \dots, v_n\}$ for $V$ and $\{u_1, u_2, \dots, u_m\}$ for $W$ and positive scalars

\begin{equation*} \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r \end{equation*}

such that

\begin{equation*} T(v_i) = \begin{cases} \sigma_i u_i, & 1 \le i \le r, \\ 0, & i > r. \end{cases} \end{equation*}

Thus

\begin{equation*} T^*(u_i) = \sum_{j=1}^{n} \langle T^*(u_i), v_j \rangle v_j = \begin{cases} \sigma_i v_i, & 1 \le i \le r, \\ 0, & i > r. \end{cases} \end{equation*}

  • Conversely, suppose that the preceding conditions are satisfied. Then for $1 \le i \le n$, $v_i$ is an eigenvector of $T^*T$ with corresponding eigenvalue $\sigma_i^2$ if $1 \le i \le r$, and $0$ if $i > r$. Therefore the scalars $\sigma_1, \sigma_2, \dots, \sigma_r$ are uniquely determined by $T$.

Definition. The unique scalars $\sigma_1, \sigma_2, \dots, \sigma_r$ are called the singular values of $T$. If $r$ is less than both $m$ and $n$, then the term singular value is extended to include $\sigma_{r+1} = \cdots = \sigma_k = 0$, where $k$ is the minimum of $m$ and $n$.

Definition. Let $A$ be an $m \times n$ matrix. We define the singular values of $A$ to be the singular values of the linear transformation $L_A$.

Theorem 6.23 (Singular Value Decomposition Theorem for Matrices). Let $A$ be an $m \times n$ matrix of rank $r$ with the positive singular values

\begin{equation*} \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r, \end{equation*}

and let $\Sigma$ be the $m \times n$ matrix defined by

\begin{equation*} \Sigma_{ij} = \begin{cases} \sigma_i, & i = j \le r, \\ 0, & \text{otherwise}. \end{cases} \end{equation*}

Then there exists an $m \times m$ unitary matrix $U$ and an $n \times n$ unitary matrix $V$ such that

\begin{equation*} A = U \Sigma V^* . \end{equation*}

Definition. Let $A$ be an $m \times n$ matrix of rank $r$ with positive singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$. A factorization $A = U \Sigma V^*$, where $U$ and $V$ are unitary matrices and $\Sigma$ is the $m \times n$ matrix defined above, is called a singular value decomposition of $A$.
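NumPy computes this factorization directly; the sketch below (illustrative matrix) rebuilds $\Sigma$ in the $m \times n$ shape required by Theorem 6.23 and checks that $A = U\Sigma V^*$.

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, sing, Vh = np.linalg.svd(A, full_matrices=True)   # Vh is V*

Sigma = np.zeros(A.shape)
Sigma[:len(sing), :len(sing)] = np.diag(sing)        # sigma_i on the diagonal, zeros elsewhere

print(sing)                                  # sigma_1 >= sigma_2 > 0
print(np.allclose(A, U @ Sigma @ Vh))        # A = U Sigma V*
```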

6.8 Bilinear and Quadratic Forms

Bilinear Form

Let $V$ be a vector space over a field $F$. A function $H$ from the set $V \times V$ of ordered pairs of vectors to $F$ is called a bilinear form on $V$ if $H$ is linear in each variable when the other variable is held fixed; that is, $H$ is a bilinear form on $V$ if

(a) $H(ax_1 + x_2, y) = aH(x_1, y) + H(x_2, y)$ for all $x_1, x_2, y \in V$ and $a \in F$,

(b) $H(x, ay_1 + y_2) = aH(x, y_1) + H(x, y_2)$ for all $x, y_1, y_2 \in V$ and $a \in F$.

Definition. Let $V$ be a vector space, let $H_1$ and $H_2$ be bilinear forms on $V$, and let $a$ be a scalar. We define the sum $H_1 + H_2$ and the scalar product $aH_1$ by the equations

\begin{equation*} (H_1 + H_2)(x,y) = H_1(x,y) + H_2(x,y) \end{equation*}

and

\begin{equation*} (aH_1)(x,y) = a(H_1(x,y)), \quad \text{for all } x,y \in V. \end{equation*}

  • For any vector space $V$, the sum of two bilinear forms and the product of a scalar and a bilinear form on $V$ are again bilinear forms on $V$. Furthermore, $\mathcal{B}(V)$, the set of all bilinear forms on $V$, is a vector space with respect to these operations.

Definition. Let $\beta = \{v_1, v_2, \dots, v_n\}$ be an ordered basis for an $n$-dimensional vector space $V$, and let $H \in \mathcal{B}(V)$. We can associate with $H$ an $n \times n$ matrix $A$ whose entry in row $i$ and column $j$ is defined by

\begin{equation*} A_{ij} = H(v_i, v_j) \quad \text{for } i,j = 1,2,\dots,n. \end{equation*}

The matrix $A$ above is called the matrix representation of $H$ with respect to the ordered basis $\beta$ and is denoted by $\psi_\beta(H)$.

Theorem 6.24. Let $F$ be a field, $n$ a positive integer, and $\beta$ be the standard ordered basis for $F^n$. Then for any $H \in \mathcal{B}(F^n)$, there exists a unique matrix $A \in M_{n \times n}(F)$, namely $A = \psi_\beta(H)$, such that

\begin{equation*} H(x,y) = x^t A y \quad \text{for all } x,y \in F^n. \end{equation*}

Definition. Let $A, B \in M_{n \times n}(F)$. Then $B$ is said to be congruent to $A$ if there exists an invertible matrix $Q \in M_{n \times n}(F)$ such that

\begin{equation*} B = Q^t A Q. \end{equation*}

Theorem 6.25. Let $V$ be a finite-dimensional vector space with ordered bases $\beta = \{v_1, v_2, \dots, v_n\}$ and $\gamma = \{w_1, w_2, \dots, w_n\}$, and let $Q$ be the change-of-coordinate matrix changing $\gamma$-coordinates into $\beta$-coordinates. Then, for any $H \in \mathcal{B}(V)$, we have

\begin{equation*} \psi_\gamma(H) = Q^t \psi_\beta(H) Q. \end{equation*}

Therefore $\psi_\gamma(H)$ is congruent to $\psi_\beta(H)$.

Symmetric Bilinear Form

A bilinear form $H$ on a vector space $V$ is symmetric if $H(x,y) = H(y,x)$ for all $x,y \in V$. As the name suggests, symmetric bilinear forms correspond to symmetric matrices.

Theorem 6.26. Let $H$ be a bilinear form on a finite-dimensional vector space $V$, and let $\beta$ be an ordered basis for $V$. Then $H$ is symmetric if and only if $\psi_\beta(H)$ is symmetric.

Definition. A bilinear form $H$ on a finite-dimensional vector space $V$ is called diagonalizable if there is an ordered basis $\beta$ for $V$ such that $\psi_\beta(H)$ is a diagonal matrix.

Lemma. Let $H$ be a nonzero symmetric bilinear form on a vector space $V$ over a field $F$ not of characteristic two. Then there is a vector $x$ in $V$ such that $H(x,x) \neq 0$.

Theorem 6.27. Let $V$ be a finite-dimensional vector space over a field $F$ not of characteristic two. Then every symmetric bilinear form on $V$ is diagonalizable.

Quadratic Form

Let $V$ be a vector space over $F$. A function $K : V \to F$ is called a quadratic form if there exists a symmetric bilinear form $H \in \mathcal{B}(V)$ such that

\begin{equation*} K(x) = H(x,x) \quad \text{for all } x \in V. \end{equation*}

Definition. Given the variables $t_1, t_2, \dots, t_n$ that take values in a field $F$ not of characteristic two and given (not necessarily distinct) scalars $a_{ij}$ $(1 \le i \le j \le n)$, define the polynomial

\begin{equation*} f(t_1, t_2, \dots, t_n) = \sum_{i \le j} a_{ij} t_i t_j . \end{equation*}

Any such polynomial is a quadratic form. In fact, if $\beta$ is the standard ordered basis for $F^n$, then the symmetric bilinear form $H$ corresponding to the quadratic form $f$ has the matrix representation $\psi_\beta(H) = A$, where

\begin{equation*} A_{ij} = A_{ji} = \begin{cases} a_{ii}, & \text{if } i = j, \\ \frac{1}{2} a_{ij}, & \text{if } i \ne j . \end{cases} \end{equation*}

Theorem 6.28. Let $K$ be a quadratic form on a finite-dimensional real inner product space $V$. There exists an orthonormal basis $\beta = \{v_1, v_2, \dots, v_n\}$ for $V$ and scalars $\lambda_1, \lambda_2, \dots, \lambda_n$ (not necessarily distinct) such that if $x \in V$ and

\begin{equation*} x = \sum_{i=1}^{n} s_i v_i, \quad s_i \in \mathbb{R}, \end{equation*}

then

\begin{equation*} K(x) = \sum_{i=1}^{n} \lambda_i s_i^2. \end{equation*}
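Theorem 6.28 is the principal axis theorem in coordinates: diagonalize the symmetric matrix of the form by an orthogonal change of basis. A NumPy sketch with an illustrative quadratic form $K(t_1, t_2) = 2t_1^2 + 2t_1t_2 + 2t_2^2$:

```python
import numpy as np

# Symmetric matrix of K: diagonal entries a_ii, off-diagonal entries a_ij / 2.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, Q = np.linalg.eigh(A)      # orthonormal eigenbasis; K(x) = sum_i lambda_i s_i^2

x = np.array([1.0, 2.0])
s = Q.T @ x                     # coordinates s_i of x in the eigenbasis
print(x @ A @ x)                # K(x) computed directly: 14.0
print(np.sum(lam * s**2))       # same value in diagonalized form
```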