Positive Definiteness

1. Definite Matrices

  • An N×NN \times N symmetric matrix AA is positive definite if and only if for all N×1N \times 1 vector x0x \not = 0, xtAx>0x^t Ax > 0.
  • An N×NN \times N symmetric matrix AA is positive semi-definite if and only if for all N×1N \times 1 vector x0x \not = 0, xtAx0x^t Ax \ge 0.
  • An N×NN \times N symmetric matrix AA is negative definite if and only if for all N×1N \times 1 vector x0x \not = 0, xtAx<0x^t Ax < 0.
  • An N×NN \times N symmetric matrix AA is negative semi-definite if and only if for all N×1N \times 1 vector x0x \not = 0, xtAx0x^t Ax \le 0.
  • A matrix that is not positive semi-definite and not negative semi-definite is indefinite.

2. What they look like

  • Suppose that AA is a symmetric 2×22 \times 2 matrix and x0x \not = 0 is a 2×12 \times 1 vector.
A=(abbc)    xtAx=(x1x2)(abbc)(x1x2)=ax12+2bx1x2+cx22\begin{equation} A = \left(\begin{array}{cc} a & b \\ b & c \end{array}\right) \implies x^t Ax = \left(\begin{array}{cc} x_1 & x_2 \end{array}\right) \left(\begin{array}{cc} a & b \\ b & c \end{array}\right) \left(\begin{array}{c} x_1 \\ x_2 \end{array}\right) = ax^2_1 + 2bx_1 x_2 + cx^2_2 \end{equation}

Definite

  • As an example of the positive definiteness, let a=c=1a = c = 1 and b=0b = 0. It means xtAx=x12+x22>0x^t Ax = x^2_1 + x^2_2 > 0. It looks like a convex function.
  • As an example of the positive semi-definiteness, let a=1a = 1 and b=c=0b = c = 0. It means xtAx=x120x^t Ax = x^2_1 \ge 0. It looks like a convex function, but there exists a point such that xtAx=0x^t Ax = 0 for x0x \not = 0.
  • As an example of the negative definiteness, let a=c=1a = c = -1 and b=0b = 0. It means xtAx=x12x22<0x^t Ax = -x^2_1 - x^2_2 < 0. It looks like a concave function.
  • As an example of the negative semi-definiteness, let a=1a = -1 and b=c=0b = c = 0. It means xtAx=x120x^t Ax = -x^2_1 \le 0. It looks like a concave function, but there exists a point such that xtAx=0x^t Ax = 0 for x0x \not = 0.
  • As an example of the indefiniteness, let a=1a = 1, b=0b = 0, and c=1c = -1. It means xtAx=x12x22x^t Ax = x^2_1 - x^2_2. It does not look like a shape of the positive and negative definiteness. It has rather a saddle point.

Indefinite

3. Application: Hessian matrix

  • Suppose that f:RnRf: \R^n \to \R is twice continuously differentiable function and xx is a critical point of ff. According to Taylor’s Theorem,
f(x+s)=f(x)+f(x)ts+12stHf(x+αs)s\begin{equation} f(x + s) = f(x) + \nabla f(x)^t s + \frac{1}{2}s^t H_f(x + \alpha s)s \end{equation}

where ss is an N×1N \times 1 vector, HfH_f is the Hessian matrix of ff and α(0,1)\alpha \in (0, 1). Since xx is a critical point, f(x)=0\nabla f(x) = 0. Meanwhile, if HfH_f is positive definite, this property should be also true near xx, so is Hf(x+αs)H_f(x + \alpha s). It means that ff is increasing near xx. Therefore, f(x+s)>f(x)f(x + s) > f(x) and f(x)f(x) is a local minimum.

  • For a critical point xx, if HfH_f is positive definite, then xx is a minimum of ff.
  • For a critical point xx, if HfH_f is negative definite, then xx is a maximum of ff.
  • For a critical point xx, if HfH_f is indefinite, then xx is a saddle point of ff.
  • For a critical point xx, if HfH_f is semi-definite, then it requires a higher order differentiation.

4. Application: Cholesky factorization

  • If AA is symmetric and positive (semi-)definite, the Cholesky factorization of AA is available, which is arranged so that LLt=ALL^t = A where LL is lower triangular.
  • If there exists LL such that LLt=ALL^t = A, then for x0x \not = 0, xtAx=xtLLtx=(Ltx)t(Ltx)=Ltx20x^t Ax = x^t LL^t x = (L^t x)^t(L^t x) = \left\| L^t x \right\|^2 \ge 0. Therefore, AA is positive (semi-)definite.
  • Here are some examples:
A1=(9665)=(3021)(3201)    xtA1x=(3x1+2x2)2+x22>0:positive definiteA2=(9664)=(3020)(3200)    xtA2x=(3x1+2x2)20:positive semi-definiteA3=(9663)    xtA3x=(3x1+2x2)2x22:indefiniteA4=(0004)=(0020)(0200)=(0002)(0002)    xtA4x=4x220:positive semi-definite\begin{aligned} A_1 &= \left(\begin{array}{cc} 9 & 6 \\ 6 & 5 \end{array}\right) = \left(\begin{array}{cc} 3 & 0 \\ 2 & 1 \end{array}\right) \left(\begin{array}{cc} 3 & 2 \\ 0 & 1 \end{array}\right) \\ &\implies x^t A_1 x = (3x_1 +2x_2)^2 + x^2_2 > 0: \text{positive definite} \\ \\ A_2 &= \left(\begin{array}{cc} 9 & 6 \\ 6 & 4 \end{array}\right) = \left(\begin{array}{cc} 3 & 0 \\ 2 & 0 \end{array}\right) \left(\begin{array}{cc} 3 & 2 \\ 0 & 0 \end{array}\right) \\ &\implies x^t A_2 x = (3x_1 + 2x_2)^2 \ge 0: \text{positive semi-definite} \\ \\ A_3 &= \left(\begin{array}{cc} 9 & 6 \\ 6 & 3 \end{array}\right) \implies x^t A_3 x = (3x_1 + 2x_2)^2 - x^2_2: \text{indefinite} \\ \\ A_4 &= \left(\begin{array}{cc} 0 & 0 \\ 0 & 4 \end{array}\right) = \left(\begin{array}{cc} 0 & 0 \\ 2 & 0 \end{array}\right) \left(\begin{array}{cc} 0 & 2 \\ 0 & 0 \end{array}\right) = \left(\begin{array}{cc} 0 & 0 \\ 0 & 2 \end{array}\right) \left(\begin{array}{cc} 0 & 0 \\ 0 & 2 \end{array}\right) \\ &\implies x^t A_4 x = 4x_2^2 \ge 0: \text{positive semi-definite} \end{aligned}
  • Note that A3A_3 cannot be factorized because it is indefinite. Moreover, the Cholesky factorization can have more than one solution such as A4A_4.

5. How to tell if the matrix is positive definite

  • All pivots are positive: After Gaussian elimination, pivots are the first non-zero element in each row of a matrix. If all pivots of a matrix are positive, then it is positive definite.
A=(1221)    (1203)\begin{aligned} A &= \left(\begin{array}{cc} 1 & 2 \\ 2 & 1 \end{array}\right) \implies \left(\begin{array}{cc} 1 & 2 \\ 0 & -3 \end{array}\right) \end{aligned}

After Gaussian elimination, AA has the negative pivot, so it is not positive definite.

  • All the determinants of upper-left submatrices are positive:
A=(210121012)    2=2,2112=3,210121012=4\begin{aligned} A &= \left(\begin{array}{ccc} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{array}\right) \implies \left\vert\begin{array}{c} 2 \end{array}\right\vert = 2, \left\vert\begin{array}{cc} 2 & -1 \\ -1 & 2 \end{array}\right\vert = 3, \left\vert\begin{array}{ccc} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{array}\right\vert = 4 \end{aligned}

So, AA is positive definite.

  • All the eigenvalues are positive:
A=(908652933)    λ1=14.48,λ2=5.90,λ3=1.57\begin{aligned} A &= \left(\begin{array}{ccc} 9 & 0 & -8 \\ 6 & -5 & -2 \\ -9 & 3 & 3 \end{array}\right) \implies \lambda_1 = 14.48, \lambda_2 = -5.90, \lambda_3 = -1.57 \end{aligned}

Since there are negative eigenvalues, AA is not positive definite.

  • By definition: For an N×NN \times N matrix MM, MtMM^t M is positive (semi-)definite.
xtMtMx=(Mx)t(Mx)=Mx20\begin{aligned} x^t M^t Mx = (Mx)^t (Mx) = \left\| Mx \right\|^2 \ge 0 \end{aligned}

If Mx0Mx \not = 0 for all x0x \not = 0, MtMM^t M is positive definite.

Reference

[1] Michael T. Heath, Scientific Computing: An Introductory Survey. 2nd Edition, McGraw-Hill Higher Education.


© 2024. All rights reserved.