Schur product theorem

In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix. The result is named after Issai Schur[1] (Schur 1911, p. 14, Theorem VII) (note that Schur signed as J. Schur in Journal für die reine und angewandte Mathematik.[2][3])

We remark that the converse of the theorem holds in the following sense. If M {\displaystyle M} is a symmetric matrix and the Hadamard product M N {\displaystyle M\circ N} is positive definite for all positive definite matrices N {\displaystyle N} , then M {\displaystyle M} itself is positive definite.

Proof

Proof using the trace formula

For any matrices M {\displaystyle M} and N {\displaystyle N} , the Hadamard product M N {\displaystyle M\circ N} considered as a bilinear form acts on vectors a , b {\displaystyle a,b} as

a ( M N ) b = tr ( M T diag ( a ) N diag ( b ) ) {\displaystyle a^{*}(M\circ N)b=\operatorname {tr} \left(M^{\textsf {T}}\operatorname {diag} \left(a^{*}\right)N\operatorname {diag} (b)\right)}

where tr {\displaystyle \operatorname {tr} } is the matrix trace and diag ( a ) {\displaystyle \operatorname {diag} (a)} is the diagonal matrix having as diagonal entries the elements of a {\displaystyle a} .

Suppose M {\displaystyle M} and N {\displaystyle N} are positive definite, and so Hermitian. We can consider their square-roots M 1 2 {\displaystyle M^{\frac {1}{2}}} and N 1 2 {\displaystyle N^{\frac {1}{2}}} , which are also Hermitian, and write

tr ( M T diag ( a ) N diag ( b ) ) = tr ( M ¯ 1 2 M ¯ 1 2 diag ( a ) N 1 2 N 1 2 diag ( b ) ) = tr ( M ¯ 1 2 diag ( a ) N 1 2 N 1 2 diag ( b ) M ¯ 1 2 ) {\displaystyle \operatorname {tr} \left(M^{\textsf {T}}\operatorname {diag} \left(a^{*}\right)N\operatorname {diag} (b)\right)=\operatorname {tr} \left({\overline {M}}^{\frac {1}{2}}{\overline {M}}^{\frac {1}{2}}\operatorname {diag} \left(a^{*}\right)N^{\frac {1}{2}}N^{\frac {1}{2}}\operatorname {diag} (b)\right)=\operatorname {tr} \left({\overline {M}}^{\frac {1}{2}}\operatorname {diag} \left(a^{*}\right)N^{\frac {1}{2}}N^{\frac {1}{2}}\operatorname {diag} (b){\overline {M}}^{\frac {1}{2}}\right)}

Then, for a = b {\displaystyle a=b} , this is written as tr ( A A ) {\displaystyle \operatorname {tr} \left(A^{*}A\right)} for A = N 1 2 diag ( a ) M ¯ 1 2 {\displaystyle A=N^{\frac {1}{2}}\operatorname {diag} (a){\overline {M}}^{\frac {1}{2}}} and thus is strictly positive for A 0 {\displaystyle A\neq 0} , which occurs if and only if a 0 {\displaystyle a\neq 0} . This shows that ( M N ) {\displaystyle (M\circ N)} is a positive definite matrix.

Proof using Gaussian integration

Case of M = N

Let X {\displaystyle X} be an n {\displaystyle n} -dimensional centered Gaussian random variable with covariance X i X j = M i j {\displaystyle \langle X_{i}X_{j}\rangle =M_{ij}} . Then the covariance matrix of X i 2 {\displaystyle X_{i}^{2}} and X j 2 {\displaystyle X_{j}^{2}} is

Cov ( X i 2 , X j 2 ) = X i 2 X j 2 X i 2 X j 2 {\displaystyle \operatorname {Cov} \left(X_{i}^{2},X_{j}^{2}\right)=\left\langle X_{i}^{2}X_{j}^{2}\right\rangle -\left\langle X_{i}^{2}\right\rangle \left\langle X_{j}^{2}\right\rangle }

Using Wick's theorem to develop X i 2 X j 2 = 2 X i X j 2 + X i 2 X j 2 {\displaystyle \left\langle X_{i}^{2}X_{j}^{2}\right\rangle =2\left\langle X_{i}X_{j}\right\rangle ^{2}+\left\langle X_{i}^{2}\right\rangle \left\langle X_{j}^{2}\right\rangle } we have

Cov ( X i 2 , X j 2 ) = 2 X i X j 2 = 2 M i j 2 {\displaystyle \operatorname {Cov} \left(X_{i}^{2},X_{j}^{2}\right)=2\left\langle X_{i}X_{j}\right\rangle ^{2}=2M_{ij}^{2}}

Since a covariance matrix is positive definite, this proves that the matrix with elements M i j 2 {\displaystyle M_{ij}^{2}} is a positive definite matrix.

General case

Let X {\displaystyle X} and Y {\displaystyle Y} be n {\displaystyle n} -dimensional centered Gaussian random variables with covariances X i X j = M i j {\displaystyle \left\langle X_{i}X_{j}\right\rangle =M_{ij}} , Y i Y j = N i j {\displaystyle \left\langle Y_{i}Y_{j}\right\rangle =N_{ij}} and independent from each other so that we have

X i Y j = 0 {\displaystyle \left\langle X_{i}Y_{j}\right\rangle =0} for any i , j {\displaystyle i,j}

Then the covariance matrix of X i Y i {\displaystyle X_{i}Y_{i}} and X j Y j {\displaystyle X_{j}Y_{j}} is

Cov ( X i Y i , X j Y j ) = X i Y i X j Y j X i Y i X j Y j {\displaystyle \operatorname {Cov} \left(X_{i}Y_{i},X_{j}Y_{j}\right)=\left\langle X_{i}Y_{i}X_{j}Y_{j}\right\rangle -\left\langle X_{i}Y_{i}\right\rangle \left\langle X_{j}Y_{j}\right\rangle }

Using Wick's theorem to develop

X i Y i X j Y j = X i X j Y i Y j + X i Y i X j Y j + X i Y j X j Y i {\displaystyle \left\langle X_{i}Y_{i}X_{j}Y_{j}\right\rangle =\left\langle X_{i}X_{j}\right\rangle \left\langle Y_{i}Y_{j}\right\rangle +\left\langle X_{i}Y_{i}\right\rangle \left\langle X_{j}Y_{j}\right\rangle +\left\langle X_{i}Y_{j}\right\rangle \left\langle X_{j}Y_{i}\right\rangle }

and also using the independence of X {\displaystyle X} and Y {\displaystyle Y} , we have

Cov ( X i Y i , X j Y j ) = X i X j Y i Y j = M i j N i j {\displaystyle \operatorname {Cov} \left(X_{i}Y_{i},X_{j}Y_{j}\right)=\left\langle X_{i}X_{j}\right\rangle \left\langle Y_{i}Y_{j}\right\rangle =M_{ij}N_{ij}}

Since a covariance matrix is positive definite, this proves that the matrix with elements M i j N i j {\displaystyle M_{ij}N_{ij}} is a positive definite matrix.

Proof using eigendecomposition

Proof of positive semidefiniteness

Let M = μ i m i m i T {\displaystyle M=\sum \mu _{i}m_{i}m_{i}^{\textsf {T}}} and N = ν i n i n i T {\displaystyle N=\sum \nu _{i}n_{i}n_{i}^{\textsf {T}}} . Then

M N = i j μ i ν j ( m i m i T ) ( n j n j T ) = i j μ i ν j ( m i n j ) ( m i n j ) T {\displaystyle M\circ N=\sum _{ij}\mu _{i}\nu _{j}\left(m_{i}m_{i}^{\textsf {T}}\right)\circ \left(n_{j}n_{j}^{\textsf {T}}\right)=\sum _{ij}\mu _{i}\nu _{j}\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf {T}}}

Each ( m i n j ) ( m i n j ) T {\displaystyle \left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf {T}}} is positive semidefinite (but, except in the 1-dimensional case, not positive definite, since they are rank 1 matrices). Also, μ i ν j > 0 {\displaystyle \mu _{i}\nu _{j}>0} thus the sum M N {\displaystyle M\circ N} is also positive semidefinite.

Proof of definiteness

To show that the result is positive definite requires even further proof. We shall show that for any vector a 0 {\displaystyle a\neq 0} , we have a T ( M N ) a > 0 {\displaystyle a^{\textsf {T}}(M\circ N)a>0} . Continuing as above, each a T ( m i n j ) ( m i n j ) T a 0 {\displaystyle a^{\textsf {T}}\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf {T}}a\geq 0} , so it remains to show that there exist i {\displaystyle i} and j {\displaystyle j} for which corresponding term above is nonzero. For this we observe that

a T ( m i n j ) ( m i n j ) T a = ( k m i , k n j , k a k ) 2 {\displaystyle a^{\textsf {T}}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{\textsf {T}}a=\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}}

Since N {\displaystyle N} is positive definite, there is a j {\displaystyle j} for which n j a 0 {\displaystyle n_{j}\circ a\neq 0} (since otherwise n j T a = k ( n j a ) k = 0 {\displaystyle n_{j}^{\textsf {T}}a=\sum _{k}(n_{j}\circ a)_{k}=0} for all j {\displaystyle j} ), and likewise since M {\displaystyle M} is positive definite there exists an i {\displaystyle i} for which k m i , k ( n j a ) k = m i T ( n j a ) 0. {\displaystyle \sum _{k}m_{i,k}(n_{j}\circ a)_{k}=m_{i}^{\textsf {T}}(n_{j}\circ a)\neq 0.} However, this last sum is just k m i , k n j , k a k {\displaystyle \sum _{k}m_{i,k}n_{j,k}a_{k}} . Thus its square is positive. This completes the proof.

References

  1. ^ Schur, J. (1911). "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik. 1911 (140): 1–28. doi:10.1515/crll.1911.140.1. S2CID 120411177.
  2. ^ Zhang, Fuzhen, ed. (2005). The Schur Complement and Its Applications. Numerical Methods and Algorithms. Vol. 4. doi:10.1007/b105056. ISBN 0-387-24271-6., page 9, Ch. 0.6 Publication under J. Schur
  3. ^ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society. 15 (2): 97–106. doi:10.1112/blms/15.2.97.
  • Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen at EUDML