Some Facts about Eigenvalues

1. Localizing Eigenvalues

  • Gershgorin’s Theorem is useful when we want to know the rough location of the eigenvalues of a matrix.
  • Gershgorin’s Theorem says that all the eigenvalues of an $n \times n$ matrix $A$ lie within $n$ disks. The $k$-th of these $n$ disks is centered at $a_{kk}$ and its radius is $\sum_{j \neq k} \vert a_{kj} \vert$, where $a_{ij}$ is the $(i, j)$ element of $A$.
  • To see why, suppose that $x$ is an eigenvector of $A$ with eigenvalue $\lambda$, scaled so that $\left\| x \right\|_{\infty} = 1$. Then some component of $x$ satisfies $x_k = 1$.
$$\begin{aligned} Ax = \lambda x &= \left(\begin{array}{ccccc} & & \vdots & & \\ a_{k1} & \cdots & a_{kk} & \cdots & a_{kn} \\ & & \vdots & & \end{array}\right) \left(\begin{array}{c} x_1 \\ \vdots \\ x_k \\ \vdots \\ x_n \end{array}\right) = \lambda \left(\begin{array}{c} x_1 \\ \vdots \\ x_k \\ \vdots \\ x_n \end{array}\right) \\ &\implies a_{k1}x_1 + \cdots + a_{kk}x_k + \cdots + a_{kn}x_n = \lambda x_k \\ &\implies (\lambda - a_{kk})x_k = \lambda - a_{kk} = \sum_{j \neq k} a_{kj}x_j \\ &\implies \vert \lambda - a_{kk} \vert = \left\vert \sum_{j \neq k} a_{kj} x_j \right\vert \le \sum_{j \neq k}\vert a_{kj} \vert \vert x_j \vert \le \sum_{j \neq k}\vert a_{kj} \vert = \text{radius} \end{aligned}$$
  • For example, consider the following matrices $A_1$ and $A_2$:

$$A_1 = \left(\begin{array}{ccc} \color{blue}{4.0} & -0.5 & 0.0 \\ 0.6 & \color{blue}{5.0} & -0.6 \\ 0.0 & 0.5 & \color{blue}{3.0} \end{array}\right), \quad A_2 = \left(\begin{array}{ccc} \color{red}{4.0} & 0.5 & 0.0 \\ 0.6 & \color{red}{5.0} & 0.6 \\ 0.0 & 0.5 & \color{red}{3.0} \end{array}\right)$$

[Figure: Gershgorin disks of $A_1$ and $A_2$ with their eigenvalues]

$A_1$ and $A_2$ have the same diagonal elements, and in each row the sum of the absolute values of the off-diagonal elements is also the same, so their Gershgorin disks coincide. In the figure above, the blue circles are the eigenvalues of $A_1$ and the red circles are those of $A_2$.
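The disk computation is easy to check numerically. A minimal sketch with NumPy, using the matrix $A_1$ from the example above (the helper name `gershgorin_disks` is ours):

```python
import numpy as np

def gershgorin_disks(A):
    """Return (center, radius) pairs: disk k is centered at a_kk with
    radius equal to the sum of |a_kj| over the off-diagonal entries of row k."""
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return list(zip(centers, radii))

A1 = np.array([[4.0, -0.5,  0.0],
               [0.6,  5.0, -0.6],
               [0.0,  0.5,  3.0]])

# disks: centers 4.0, 5.0, 3.0 with radii 0.5, 1.2, 0.5
disks = gershgorin_disks(A1)

# by the theorem, every eigenvalue lies in the union of the disks
for lam in np.linalg.eigvals(A1):
    assert any(abs(lam - c) <= r + 1e-12 for c, r in disks)
```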

2. Sensitivity

  • This section concerns the sensitivity of the eigenvectors and eigenvalues to small changes in an $n \times n$ matrix $A$.
  • Let $X = (x_{(1)}, \cdots, x_{(n)})$ be the $n \times n$ matrix whose columns $x_{(i)}$ are the eigenvectors of $A$, and let $D = \mathrm{diag}(\lambda_1, \cdots, \lambda_n)$ be the $n \times n$ diagonal matrix where $\lambda_i$ is the eigenvalue corresponding to $x_{(i)}$. Then $AX = XD$.
  • When $A$ is nondefective, which means its eigenvectors are linearly independent, $X$ is invertible and $X^{-1}AX = D$. Now suppose a small change turns $A$ into $A+E$, where $E$ denotes the error. Then $X^{-1}(A+E)X = X^{-1}AX + X^{-1}EX = D+F$, where $F = X^{-1}EX$. Since $A+E$ and $D+F$ are similar, they must have the same eigenvalues. Let $\mu$ be an eigenvalue of $A+E$; then for the eigenvector $v$ corresponding to $\mu$,
$$(D+F)v = \mu v \implies v = (\mu I - D)^{-1}Fv$$

For this to hold, $(\mu I - D)$ must be nonsingular, which is the case exactly when none of its diagonal elements $\mu - \lambda_i$ is zero, that is, when $\mu$ is not an eigenvalue of $D$. If $\mu$ were an eigenvalue of $D$, the bound derived below would hold trivially because $\mu - \lambda_k = 0$, so assume $(\mu I - D)$ is nonsingular. Taking norms,

$$\left\|v\right\|_2 \le \left\|(\mu I - D)^{-1}\right\|_2 \left\|F\right\|_2 \left\|v\right\|_2 \implies 1 \le \left\|(\mu I - D)^{-1}\right\|_2 \left\|F\right\|_2$$

By definition, $\left\|(\mu I - D)^{-1}\right\|_2$ is the largest singular value of $(\mu I - D)^{-1}$, so $\left\|(\mu I - D)^{-1}\right\|_2 = \frac{1}{\vert \mu - \lambda_k \vert}$ where $\lambda_k$ is the eigenvalue of $D$ closest to $\mu$. Therefore,

$$\begin{aligned} \vert \mu - \lambda_k \vert = \frac{1}{\left\|(\mu I - D)^{-1}\right\|_2} &\le \left\|F\right\|_2 = \left\|X^{-1}EX\right\|_2 \\ &\le \left\|X^{-1}\right\|_2 \left\|E\right\|_2 \left\|X\right\|_2 = \text{cond}(X)\left\|E\right\|_2 \end{aligned}$$

So the effect of a small change in $A$ on its eigenvalues depends on the condition number of $X$, not on the condition number of $A$.
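This bound (known as the Bauer–Fike theorem) can be spot-checked numerically. A sketch under the assumption that a random matrix is nondefective, which holds with probability one:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
E = 1e-6 * rng.standard_normal((5, 5))   # small perturbation

lam, X = np.linalg.eig(A)                # columns of X are eigenvectors
bound = np.linalg.cond(X, 2) * np.linalg.norm(E, 2)

# each eigenvalue mu of A + E lies within cond(X) * ||E||_2
# of some eigenvalue lambda_k of A
for mu in np.linalg.eigvals(A + E):
    assert np.min(np.abs(mu - lam)) <= bound
```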

  • Another way to check the sensitivity of eigenvalues and eigenvectors, which works even when $A$ is defective, is to use right and left eigenvectors together. Let $x$ and $y$ be right and left eigenvectors of $A$; then there exist $\lambda$ and $\mu$ such that $Ax = \lambda x$ and $y^t A = \mu y^t$. This yields $y^t Ax = \lambda y^t x = \mu y^t x$, so either $\lambda = \mu$ or $y^t x = 0$. Assume that $y^t x \neq 0$, so $\lambda = \mu$. If $A$ is changed by an error $E$, and $x$ and $\lambda$ change accordingly, then
$$\begin{aligned} (A+E)(x + \Delta x) &= (\lambda + \Delta \lambda)(x + \Delta x) \\ Ax + Ex + A \Delta x + E \Delta x &= \lambda x + \Delta \lambda x + \lambda \Delta x + \Delta \lambda \Delta x \\ Ex + A \Delta x + E \Delta x &= \Delta \lambda x + \lambda \Delta x + \Delta \lambda \Delta x \\ Ex + A \Delta x &\approx \Delta \lambda x + \lambda \Delta x \end{aligned}$$

because the second-order terms $E \Delta x$ and $\Delta \lambda \Delta x$ are small enough to be negligible. Since $\lambda = \mu$, multiplying on the left by $y^t$ gives $y^t A \Delta x = \lambda y^t \Delta x$, so:

$$\begin{aligned} y^t Ex + y^t A \Delta x &\approx \Delta \lambda y^t x + \lambda y^t \Delta x \\ y^t Ex &\approx \Delta \lambda y^t x \\ \Delta \lambda &\approx \frac{y^t Ex}{y^t x} \\ \vert \Delta \lambda \vert &\lessapprox \frac{\left\| y \right\|_2 \left\| x \right\|_2}{\vert y^t x \vert} \left\| E \right\|_2 = \frac{1}{\cos\theta} \left\| E \right\|_2 \end{aligned}$$

where $\theta$ is the angle between $x$ and $y$, with both normalized. So the eigenvalue becomes more sensitive as $\theta$ increases toward $90°$, since $\frac{1}{\cos\theta}$ grows without bound.
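The factor $\frac{1}{\cos\theta}$ can be computed directly for a nondefective real matrix: the rows of $X^{-1}$ are left eigenvectors satisfying $y^t_{(i)} x_{(i)} = 1$, and NumPy returns unit-norm columns in $X$, so the factor reduces to the row norms of $X^{-1}$. A minimal sketch (the helper name `eigen_condition_numbers` is ours):

```python
import numpy as np

def eigen_condition_numbers(A):
    """Return eigenvalues and their sensitivities 1/|cos(theta_i)|, where
    theta_i is the angle between the i-th right and left eigenvectors."""
    lam, X = np.linalg.eig(A)   # unit-norm right eigenvectors in columns
    Y = np.linalg.inv(X)        # row i is a left eigenvector with y_i^t x_i = 1
    return lam, np.linalg.norm(Y, axis=1)

# for a symmetric matrix, left and right eigenvectors coincide (theta = 0),
# so every eigenvalue has sensitivity 1
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
_, sens = eigen_condition_numbers(S)
assert np.allclose(sens, 1.0)
```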

  • For example,
$$\begin{gather} A = \left(\begin{array}{ccc} -149 & -50 & -154 \\ 537 & 180 & 546 \\ -27 & -9 & -25 \end{array}\right) \\ \\ \text{eigenvalues: } \lambda_1 = 1,\ \lambda_2 = 2,\ \lambda_3 = 3 \\ \\ \text{normalized right eigenvectors: } X = \left(\begin{array}{ccc} x_{(1)} & x_{(2)} & x_{(3)} \end{array}\right) = \left(\begin{array}{ccc} 0.316 & 0.404 & 0.139 \\ -0.949 & -0.909 & -0.974 \\ 0.000 & -0.101 & 0.179 \end{array}\right) \\ \\ \text{normalized left eigenvectors: } Y = \left(\begin{array}{ccc} y_{(1)} & y_{(2)} & y_{(3)} \end{array}\right) = \left(\begin{array}{ccc} 0.681 & -0.676 & -0.688 \\ 0.225 & -0.225 & -0.229 \\ 0.697 & -0.701 & -0.688 \end{array}\right) \end{gather}$$

First, the condition number of $X$ is $1289$, so the eigenvalues of $A$ are sensitive. Second, $y^t_{(1)} x_{(1)} = 0.0017$, $y^t_{(2)} x_{(2)} = 0.0025$, and $y^t_{(3)} x_{(3)} = 0.0046$, so the angles between $x_{(i)}$ and $y_{(i)}$ are close to $90°$. Therefore, $A$ is expected to be sensitive to small changes. The following shows the eigenvalues that result from tiny changes to $A$:

$$\begin{aligned} A + E &= \left(\begin{array}{ccc} -149 & -50 & -154 \\ 537 & \color{red}{180.01} & 546 \\ -27 & -9 & -25 \end{array}\right) \implies \begin{cases} \lambda_1 = 0.207 \\ \lambda_2 = 2.301 \\ \lambda_3 = 3.502 \end{cases} \\ \\ A + E &= \left(\begin{array}{ccc} -149 & -50 & -154 \\ 537 & \color{red}{179.99} & 546 \\ -27 & -9 & -25 \end{array}\right) \implies \begin{cases} \lambda_1 = 1.664 + 1.054i \\ \lambda_2 = 1.664 - 1.054i \\ \lambda_3 = 2.662 \end{cases} \end{aligned}$$

This demonstrates how small changes in $A$ can produce large changes in its eigenvalues.
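The two perturbations above are easy to reproduce; a sketch assuming NumPy:

```python
import numpy as np

A = np.array([[-149.0, -50.0, -154.0],
              [ 537.0, 180.0,  546.0],
              [ -27.0,  -9.0,  -25.0]])

Ap = A.copy(); Ap[1, 1] = 180.01   # a_22 increased by 0.01
Am = A.copy(); Am[1, 1] = 179.99   # a_22 decreased by 0.01

print(np.linalg.eigvals(Ap))   # real eigenvalues near 0.207, 2.301, 3.502
print(np.linalg.eigvals(Am))   # a complex-conjugate pair appears
```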

3. Properties

  • Suppose that $x$ is an eigenvector of an $n \times n$ matrix $A$ and $\lambda$ is the eigenvalue corresponding to $x$.
  • $A$ has an eigenvalue equal to zero

    $\iff$ there exists $x \neq 0$ such that $Ax = 0$

    $\iff \dim \mathrm{Ker}\, A > 0$

    $\iff$ $A$ is singular

  • For $\alpha \neq 0 \in \mathbb{R}$, $\alpha x$ is also an eigenvector of $A$, with the same eigenvalue $\lambda$.
  • $x$ is an eigenvector of $\alpha A$ for $\alpha \in \mathbb{R}$, and its eigenvalue is $\alpha \lambda$.
  • $x$ is an eigenvector of $A + \alpha I$ for $\alpha \in \mathbb{R}$, and its eigenvalue is $\lambda + \alpha$.
  • For $k \in \mathbb{N}$, $x$ is an eigenvector of $A^k$, and its eigenvalue is $\lambda^k$.
  • If $A$ is invertible, $x$ is an eigenvector of $A^{-1}$, and its eigenvalue is $\frac{1}{\lambda}$.
  • For a diagonal matrix $D = \mathrm{diag}(a_1, \cdots, a_n)$, the eigenvalues are $a_1, \cdots, a_n$ and the eigenvectors are $e_1, \cdots, e_n$, where every component of the standard basis vector $e_i \in \mathbb{R}^n$ is $0$ except the $i$-th, which is $1$.
  • For an upper or lower triangular matrix, the eigenvalues are its diagonal elements.
  • For an $n \times n$ nonsingular matrix $S$, $S^{-1}x$ is an eigenvector of $S^{-1}AS$, and the eigenvalue corresponding to $S^{-1}x$ is $\lambda$. Eigenvalues are unchanged by similarity transformations.
  • $\det A = \lambda_1 \cdots \lambda_n$.
  • For $k \le n$, if the eigenvalues $\lambda_1, \cdots, \lambda_k$ are distinct, the corresponding eigenvectors $x_{(1)}, \cdots, x_{(k)}$ are linearly independent.
  • If all the eigenvalues are distinct, all the eigenvectors are linearly independent, so $X = (x_{(1)}, \cdots, x_{(n)})$ is nonsingular and $A$ is diagonalizable as $X^{-1}AX = \mathrm{diag}(\lambda_1, \cdots, \lambda_n)$. However, $A$ may be diagonalizable even when its eigenvalues are not all distinct, such as $A = I$. For example,
$$\begin{aligned} A &= \left(\begin{array}{ccc} 3 & -1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 3 \end{array}\right) \implies \begin{cases} \lambda_1 = 2 \\ \lambda_2 = 3 \\ \lambda_3 = 3 \end{cases} \\ &\implies X = \left(\begin{array}{ccc} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{array}\right) \text{: nonsingular} \end{aligned}$$
  • The singular values of $A$ are the nonnegative square roots of the eigenvalues of $A^t A$.
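Several of these properties can be verified numerically on a random matrix; a minimal sketch (the matrix, shift, and power are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(A)
alpha = 2.5

def same_spectrum(B, expected):
    """Compare the eigenvalues of B against an expected spectrum."""
    return np.allclose(np.sort_complex(np.linalg.eigvals(B)),
                       np.sort_complex(expected))

assert same_spectrum(alpha * A, alpha * lam)                  # scaling
assert same_spectrum(A + alpha * np.eye(4), lam + alpha)      # shift
assert same_spectrum(np.linalg.matrix_power(A, 3), lam ** 3)  # powers
assert same_spectrum(np.linalg.inv(A), 1 / lam)               # inverse
S = rng.standard_normal((4, 4))
assert same_spectrum(np.linalg.inv(S) @ A @ S, lam)           # similarity
assert np.isclose(np.linalg.det(A), np.prod(lam).real)        # determinant
```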


