
Gershgorin Circle Theorem Ayush Bohra Shiv Nadar University

30 September 2019

Introduction

- Given an n × n matrix A with complex entries, one of the fundamental questions we can ask is about the eigenvalues of A.
- λ ∈ C is said to be an eigenvalue of A if there exists a nonzero vector x ∈ Cⁿ such that Ax = λx.
- One way to compute the eigenvalues is to find the roots of the polynomial equation det(λI − A) = 0, where I is the n × n identity matrix. However, this involves a lot of computation even for small values of n.
- So, before any direct computation, one may want to estimate the region in which the eigenvalues of A lie.
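As a quick numerical illustration of the characteristic-polynomial route just described, the sketch below (assuming NumPy is available; the 2×2 matrix is purely illustrative, not from the slides) compares the roots of det(λI − A) = 0 with the output of a direct eigenvalue routine:

```python
import numpy as np

# A small illustrative matrix (not from the slides)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Route 1: roots of the characteristic polynomial det(lambda*I - A) = 0
coeffs = np.poly(A)          # characteristic-polynomial coefficients of A
roots = np.roots(coeffs)

# Route 2: a direct numerical eigenvalue routine
vals = np.linalg.eigvals(A)

print(np.sort(roots.real))   # agrees with np.sort(vals.real)
```

Both routes agree here, but root-finding on the characteristic polynomial scales poorly and is numerically fragile, which motivates cheap a-priori estimates.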

Strictly Diagonally Dominant Matrices

Definition. A ∈ M_n(C) is said to be a strictly diagonally dominant (SDD) matrix if for every i ∈ {1, 2, ..., n} we have

|a_ii| > Σ_{j≠i} |a_ij|.

That is, if in every row the absolute value of the diagonal element outweighs the sum of the absolute values of the off-diagonal elements, then the matrix is SDD.

Example. The matrix

S = [  6   3   2 ]
    [ -1   4   0 ]
    [  2   5  -9 ]

is an example of an SDD matrix: 6 > 3 + 2, 4 > 1 + 0, and 9 > 2 + 5.
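The row condition in the definition is easy to check mechanically. A minimal sketch (assuming NumPy; the helper name is_sdd is ours, not from the slides):

```python
import numpy as np

def is_sdd(A):
    """Strict diagonal dominance: |a_ii| > sum_{j != i} |a_ij| for every row."""
    A = np.asarray(A)
    diag = np.abs(np.diag(A))
    off = np.abs(A).sum(axis=1) - diag   # row sums of off-diagonal magnitudes
    return bool(np.all(diag > off))

S = np.array([[6, 3, 2],
              [-1, 4, 0],
              [2, 5, -9]])
print(is_sdd(S))  # True
```

Note the inequality is strict; a matrix like [[1, 2], [0, 1]] fails the test in its first row.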

Invertibility of SDD Matrices

Theorem. Let A ∈ M_n(C) be a strictly diagonally dominant matrix. Then A is invertible.

Proof. We prove this by contradiction. Suppose A ∈ M_n(C) is SDD and singular. Consider the homogeneous system of equations Ax = 0, where 0 denotes the zero vector in Cⁿ. Since A is singular, there exists x ∈ Cⁿ, x ≠ 0, with Ax = 0. Let x_i be an entry of x with the largest absolute value among all entries of x.

Invertibility of SDD Matrices (Contd)

Proof. For this index i, the i-th equation of Ax = 0 gives

Σ_j a_ij x_j = 0
⇒ a_ii x_i = − Σ_{j≠i} a_ij x_j
⇒ a_ii = − Σ_{j≠i} (x_j / x_i) a_ij   (x_i ≠ 0, since x ≠ 0 and x_i has maximal absolute value)
⇒ |a_ii| ≤ Σ_{j≠i} |a_ij|,   since |x_j / x_i| ≤ 1 for every j.

This contradicts the strict diagonal dominance of A. Thus there is no x ≠ 0 in Cⁿ such that Ax = 0, and hence A is invertible.
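As a numerical sanity check of the theorem (assuming NumPy), the determinant of the SDD example matrix S from earlier is indeed nonzero:

```python
import numpy as np

# The SDD example matrix S from above; the theorem guarantees det(S) != 0
S = np.array([[6.0, 3.0, 2.0],
              [-1.0, 4.0, 0.0],
              [2.0, 5.0, -9.0]])
det = np.linalg.det(S)
print(det)  # nonzero (approximately -269)
```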

Gershgorin Discs

Definition. Let A ∈ M_n(C). For i ∈ {1, 2, ..., n} let R_i(A) = Σ_{j≠i} |a_ij|, and let D(a_ii, R_i(A)) ⊂ C denote the closed disc centred at a_ii with radius R_i(A). We call this disc the i-th Gershgorin disc of A ∈ M_n(C).
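The centres and radii in the definition can be read directly off the matrix. A minimal sketch (assuming NumPy; the helper name gershgorin_discs is ours, not from the slides):

```python
import numpy as np

def gershgorin_discs(A):
    """Return (centre, radius) pairs: the i-th Gershgorin disc is centred
    at a_ii with radius R_i(A) = sum_{j != i} |a_ij|."""
    A = np.asarray(A, dtype=complex)
    centres = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centres)
    return list(zip(centres, radii))

# Discs of the earlier SDD example: centred at 6, 4 and -9 with radii 5, 1 and 7
S = np.array([[6, 3, 2], [-1, 4, 0], [2, 5, -9]])
discs_S = gershgorin_discs(S)
```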

Gershgorin’s Theorem

We are now ready to state and prove a result that helps estimate the eigenvalues of any n × n matrix with complex entries.

Theorem. Let A ∈ M_n(C). Then each eigenvalue of A lies within some Gershgorin disc D(a_ii, R_i(A)). Equivalently, every eigenvalue λ satisfies |λ − a_ii| ≤ Σ_{j≠i} |a_ij| for some i ∈ {1, 2, ..., n}.

Gershgorin’s Theorem (Contd)

Proof. Suppose there exists an eigenvalue λ₀ of A which does not lie in any Gershgorin disc D(a_ii, R_i(A)). Then

∀i ∈ {1, 2, ..., n}: |λ₀ − a_ii| > Σ_{j≠i} |a_ij|.

⇒ The matrix λ₀I − A is strictly diagonally dominant.
⇒ λ₀I − A is invertible.
⇒ λ₀ cannot be an eigenvalue, a contradiction. This completes the proof.
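The theorem is easy to test numerically. The sketch below (assuming NumPy; the helper name is ours) checks that every eigenvalue of a matrix lies in at least one of its Gershgorin discs, with a small tolerance for floating-point error:

```python
import numpy as np

def eigs_in_some_disc(A, tol=1e-9):
    """Verify that every eigenvalue of A lies in at least one Gershgorin disc."""
    A = np.asarray(A, dtype=complex)
    centres = np.diag(A)
    radii = np.abs(A).sum(axis=1) - np.abs(centres)
    return all(
        any(abs(lam - c) <= r + tol for c, r in zip(centres, radii))
        for lam in np.linalg.eigvals(A)
    )

# Try it on a random matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
print(eigs_in_some_disc(A))  # True
```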

Further Remarks

- Given A ∈ M_n(C), there are n Gershgorin discs in the complex plane, each one centred at a diagonal element a_ii of A with radius R_i(A).
- Gershgorin’s Theorem tells us that every eigenvalue of A must lie in one of the discs.
- However, the theorem does not say that every disc contains an eigenvalue.
- For example, the matrix [[1, −1], [2, −1]] has eigenvalues i and −i, and both of them lie in the disc centred at (−1, 0) with radius 2. No eigenvalue of this matrix lies in the disc centred at (1, 0), which has radius 1.
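The 2×2 example above can be checked directly (assuming NumPy):

```python
import numpy as np

# The 2x2 example from the slides
M = np.array([[1.0, -1.0],
              [2.0, -1.0]])
vals = np.linalg.eigvals(M)   # the eigenvalues i and -i

# Both eigenvalues lie in the disc centred at -1 with radius 2 ...
in_disc2 = [bool(abs(lam - (-1)) <= 2) for lam in vals]
# ... but neither lies in the disc centred at 1 with radius 1 (|±i - 1| = sqrt(2) > 1)
in_disc1 = [bool(abs(lam - 1) <= 1) for lam in vals]
print(in_disc2, in_disc1)  # [True, True] [False, False]
```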

Strengthening of the Result

- From the last example we see that an individual Gershgorin disc need not contain an eigenvalue.
- However, also note that in the previous example the two discs were not disjoint.

Theorem. Let A ∈ M_n(C). If the union of k Gershgorin discs is disjoint from the union of the other n − k discs, then the former union contains exactly k eigenvalues and the latter contains exactly n − k eigenvalues (counting multiplicities).

Proof.
- Let A ∈ M_n(C). Let D be the diagonal matrix obtained from A by setting all off-diagonal elements to zero. Define a function B : [0, 1] → M_n(C) by B(t) = (1 − t)D + tA.
- Note that this keeps the diagonal elements the same but scales the off-diagonal elements by t ∈ [0, 1].
- So for all t ∈ [0, 1], the Gershgorin discs of B(t) have the same centres as those of A; only the radii are scaled by t. In particular, the union of the k discs remains disjoint from the union of the n − k discs.
- B(0) = D has eigenvalues equal to the diagonal entries. As t varies, B(t) changes, and so do its eigenvalues.

Proof (contd).
- We now make use of the fact that the eigenvalues are continuous functions of t. That is, if λ(t) = (λ₁(t), λ₂(t), ..., λₙ(t)) gives the n eigenvalues of B(t), then λ(t) is a continuous function of t.
- This is because the coefficients of the characteristic polynomial of B(t) are continuous in t, and the roots of a polynomial depend continuously on its coefficients. But the roots of the characteristic polynomial of B(t) are exactly the eigenvalues of B(t).
- Thus as t goes from 0 to 1, each λᵢ(t) traces a continuous path.

Proof (contd). We are now almost through.
- Let λᵢ(t) be an eigenvalue of B(t) which lies in the union of the k discs. We will show that it cannot ‘move’ to the union of the other n − k discs as t varies.
- The two unions are disjoint and compact, so the distance between them is some d > 0; in particular, whenever λᵢ(t) lies in the union of the k discs, its distance d(t) from the union of the n − k discs is positive.
- Suppose λᵢ(1) lies in the union of the n − k discs, so that d(1) = 0. But the path traced by λᵢ(t) is continuous, so there exists t₀ ∈ [0, 1] such that λᵢ(t₀) lies in neither union, i.e. in no Gershgorin disc, contradicting Gershgorin’s theorem for B(t₀).
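The counting result can be illustrated numerically. The sketch below uses a hypothetical matrix chosen so that its three Gershgorin discs are pairwise disjoint (assuming NumPy; the matrix is ours, not from the slides); each disc then contains exactly one eigenvalue:

```python
import numpy as np

# A hypothetical matrix whose three Gershgorin discs are pairwise disjoint:
# D(1, 0.1), D(5, 0.2), D(10, 0.1)
A = np.array([[1.0, 0.1, 0.0],
              [0.1, 5.0, 0.1],
              [0.0, 0.1, 10.0]])
centres = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centres)
eigvals = np.linalg.eigvals(A)

# With k = 1 for each disc, the theorem says every disc holds exactly one eigenvalue
counts = [int(sum(abs(lam - c) <= r for lam in eigvals))
          for c, r in zip(centres, radii)]
print(counts)  # [1, 1, 1]
```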

Better Estimations

- Note that we have proved all results using the Gershgorin discs created from the rows of the matrix.
- Since the eigenvalues of A and Aᵀ are the same, we can obtain discs from the rows of Aᵀ as well, which are nothing but the columns of A.
- If A is not a symmetric matrix, the two sets of discs differ. We can then improve our bound on the eigenvalues by observing that every eigenvalue of A must lie in the intersection of the union of the discs of A and the union of the discs of Aᵀ.

Example

Let

A = [ 10   2   3 ]
    [  0  11   1 ]
    [  0  −1  13 ]

Figure: Gershgorin Discs for matrix A

Figure: Gershgorin Discs for matrix Aᵀ

Figure: Shaded portion showing the intersection.
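The figures above can be reproduced numerically. For the matrix A, the row discs and column discs are computed below (assuming NumPy; the helper name discs is ours), and every eigenvalue is confirmed to lie in both unions, so their intersection is a tighter inclusion region than either union alone:

```python
import numpy as np

# The example matrix from the slides
A = np.array([[10.0, 2.0, 3.0],
              [0.0, 11.0, 1.0],
              [0.0, -1.0, 13.0]])

def discs(M):
    """(centre, radius) pairs of the row Gershgorin discs of M."""
    c = np.diag(M)
    r = np.abs(M).sum(axis=1) - np.abs(c)
    return list(zip(c, r))

row_discs = discs(A)     # discs from the rows of A
col_discs = discs(A.T)   # discs from the rows of A^T, i.e. the columns of A

# Every eigenvalue lies in the union of the row discs AND in the union
# of the column discs (small tolerance for floating-point error).
for lam in np.linalg.eigvals(A):
    assert any(abs(lam - c) <= r + 1e-6 for c, r in row_discs)
    assert any(abs(lam - c) <= r + 1e-6 for c, r in col_discs)
```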

References

I have referred to the following papers:
- Gershgorin’s Circle Theorem for Estimating the Eigenvalues of a Matrix with Known Error Bounds, David Marquis: http://math.stmarys-ca.edu/wpcontent/uploads/2017/07/David-Marquis.pdf
- Gershgorin’s Theorem for Estimating Eigenvalues, Sean Brakken-Thal: http://buzzard.ups.edu/courses/2007spring/projects/brakkenthalpaper.pdf