Appendix - Matrices
A few elementary facts about matrices with real entries are recalled here. The purpose of this appendix is to provide a refresher of key concepts in matrix algebra that are needed for the main development of linear algebra in the notes. Many proofs are omitted, and the stress is on acquiring a working knowledge of matrix algebra.
Basic definitions
A matrix of order $m \times n$ (read "$m$ cross $n$") is a rectangular array of real numbers, as shown below:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
Here $a_{ij}$ are real numbers for every $1 \leq i \leq m$ and $1 \leq j \leq n$. A horizontal section of the matrix is called a row, while a vertical section is called a column. Every $m \times n$ matrix thus has $m$ rows and $n$ columns. The element in the $i^{\text{th}}$ row and $j^{\text{th}}$ column of this matrix is $a_{ij}$. It is conventional to call the index $i$ in $a_{ij}$ the row index and the index $j$ the column index. We will typically use a symbol like $A$ to represent a matrix like the one shown above. The fact that the $(i,j)$ entry of $A$ is $a_{ij}$ is written as follows: $[A]_{ij} = a_{ij}$. Since the matrix has $mn$ real numbers arranged as an $m \times n$ rectangular array, we will refer to the set of all $m \times n$ matrices using the notation $\mathbb{R}^{m \times n}$. Thus, $A \in \mathbb{R}^{m \times n}$. An $m \times 1$ matrix is called a column vector, or just a vector, and a $1 \times n$ matrix is called a row vector. For row and column matrices, it is conventional to use the following shortcut notation: the $i^{\text{th}}$ entry of a column vector $x$ is written as $x_i$ instead of $x_{i1}$; a similar comment applies for row vectors.
A matrix of order $n \times n$ is called a square matrix of order $n$. An important example of a square matrix is the identity matrix of order $n$, written $I_n$, defined as follows:
$$[I_n]_{ij} = \begin{cases} 1, & i = j, \\ 0, & i \neq j. \end{cases}$$
Given a square matrix $A$ of order $n$, the ordered set of elements $(a_{11}, a_{22}, \ldots, a_{nn})$ is called the leading diagonal of $A$. The identity matrix thus has $1$ in each element of the leading diagonal and has entries zero everywhere else. An important generalization of this kind of square matrix is a diagonal matrix, whose entries are all zero except on its leading diagonal. For instance, consider the $n \times n$ matrix $D$ defined as follows:
$$D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}$$
It is conventional to denote the matrix $D$ as $\operatorname{diag}(d_1, d_2, \ldots, d_n)$. Using this notation, it can be seen that $I_n = \operatorname{diag}(1, 1, \ldots, 1)$.
Given an $m \times n$ matrix $A$, the transpose of $A$ is defined as the $n \times m$ matrix whose $(i,j)$ entry is the $(j,i)$ entry of $A$. It is conventional to denote the transpose of $A$ as $A^T$. Thus, $[A^T]_{ij} = [A]_{ji}$. An $n \times n$ matrix $A$ is said to be symmetric if $A^T = A$, and skew-symmetric, or antisymmetric, if $A^T = -A$. Here, $-A$ is the matrix whose $(i,j)$ entry is $-a_{ij}$: $[-A]_{ij} = -[A]_{ij}$.
Example
Given any matrix $A$, it is the case that $(A^T)^T = A$. To see this, note that $[(A^T)^T]_{ij} = [A^T]_{ji} = [A]_{ij}$, whence it follows that $(A^T)^T = A$, thereby proving the result.
Algebraic operations on matrices
Given two $m \times n$ matrices $A, B$, their sum $A + B$ is defined as the $m \times n$ matrix such that
$$[A + B]_{ij} = [A]_{ij} + [B]_{ij}.$$
Similarly, the scalar multiple of the matrix $A$ with a real number $\alpha$ is defined as the $m \times n$ matrix $\alpha A$ such that
$$[\alpha A]_{ij} = \alpha [A]_{ij}.$$
The product of two matrices is defined only under special conditions. Given an $m \times n$ matrix $A$ and an $n \times p$ matrix $B$, the matrix product of $A$ and $B$ is the $m \times p$ matrix $AB$ defined as follows:
$$[AB]_{ij} = \sum_{k=1}^{n} [A]_{ik} [B]_{kj}.$$
Note that the product of two square matrices of the same order is always defined. Note further that for any $n \times n$ matrix $A$, $AI_n = I_n A = A$, where $I_n$ is the identity matrix of order $n$.
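These definitions translate directly into code. The following sketch, written in Python with NumPy (an assumed dependency, not part of these notes, with arbitrarily chosen matrices), illustrates the sum, the scalar multiple, and the matrix product, and checks numerically that multiplication by the identity leaves a square matrix unchanged.

```python
import numpy as np

# Illustrative matrices (chosen arbitrarily for this sketch).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # a 2 x 3 matrix
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 1.0]])             # a 3 x 2 matrix

S = A + A        # matrix sum: entrywise addition (same order required)
M = 3.0 * A      # scalar multiple: every entry scaled by 3
P = A @ B        # matrix product: (2 x 3) times (3 x 2) gives a 2 x 2 matrix

C = np.array([[2.0, 1.0],
              [0.0, 3.0]])             # a square matrix of order 2
I = np.eye(2)                          # identity matrix of order 2
print(np.allclose(C @ I, C), np.allclose(I @ C, C))   # True True: CI = IC = C
```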
Example
Suppose that we are given two matrices $A, B$ of order $n \times n$. Then, the following equation holds: $(AB)^T = B^T A^T$. To see this, note that if $C = AB$, then
$$[C^T]_{ij} = [C]_{ji} = \sum_{k=1}^{n} [A]_{jk} [B]_{ki}.$$
Notice also that
$$[B^T A^T]_{ij} = \sum_{k=1}^{n} [B^T]_{ik} [A^T]_{kj} = \sum_{k=1}^{n} [B]_{ki} [A]_{jk}.$$
Comparing these two expressions yields the identity $(AB)^T = B^T A^T$.
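A one-line numerical check of this identity is given below; NumPy is assumed, and the random matrices are an arbitrary illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
print(np.allclose((A @ B).T, B.T @ A.T))   # True: (AB)^T = B^T A^T
```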
Example
Any square matrix $A$ can be written as the sum of a symmetric and a skew-symmetric matrix. To see this, note that
$$A = \tfrac{1}{2}\left(A + A^T\right) + \tfrac{1}{2}\left(A - A^T\right).$$
Define the matrices $A_s$ and $A_a$ as
$$A_s = \tfrac{1}{2}\left(A + A^T\right), \qquad A_a = \tfrac{1}{2}\left(A - A^T\right).$$
Equivalently,
$$[A_s]_{ij} = \tfrac{1}{2}(a_{ij} + a_{ji}), \qquad [A_a]_{ij} = \tfrac{1}{2}(a_{ij} - a_{ji}).$$
It follows that $A = A_s + A_a$. It is easy to check that $A_s$ is symmetric and $A_a$ is skew-symmetric.
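As a quick numerical check of this decomposition, the following sketch (NumPy assumed, with an arbitrarily chosen matrix) splits a square matrix into its symmetric and skew-symmetric parts and verifies the claimed properties.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [4.0, 3.0, 1.0],
              [5.0, 6.0, 2.0]])   # arbitrary square matrix for illustration

A_sym  = 0.5 * (A + A.T)   # symmetric part
A_skew = 0.5 * (A - A.T)   # skew-symmetric part

print(np.allclose(A, A_sym + A_skew))   # True: A is the sum of the two parts
print(np.allclose(A_sym, A_sym.T))      # True: symmetric part equals its transpose
print(np.allclose(A_skew, -A_skew.T))   # True: skew part equals minus its transpose
```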
Trace and determinant of square matrices
Suppose that $A$ is a square matrix of order $n$. We will now define two important scalars associated with $A$. The first, called the trace of $A$, written $\operatorname{tr}(A)$, is defined as follows:
$$\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}.$$
Thus the trace of a given matrix is the sum of the elements on its leading diagonal.
Example
Suppose that $S$ is a symmetric matrix, and $W$ is a skew-symmetric matrix of the same order. Then $\operatorname{tr}(SW) = 0$. To see this, note that
$$(SW)^T = W^T S^T = -WS.$$
Taking the trace on both sides, and using the facts that $\operatorname{tr}(A^T) = \operatorname{tr}(A)$ and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, it follows that
$$\operatorname{tr}(SW) = \operatorname{tr}\big((SW)^T\big) = -\operatorname{tr}(WS) = -\operatorname{tr}(SW).$$
This shows that $\operatorname{tr}(SW) = 0$.
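This identity is easy to confirm numerically. The sketch below (NumPy assumed; the random matrices are purely illustrative) builds a random symmetric matrix and a random skew-symmetric matrix and checks that the trace of their product vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))

S = 0.5 * (B + B.T)    # a random symmetric matrix
W = 0.5 * (B - B.T)    # a random skew-symmetric matrix

print(np.isclose(np.trace(S @ W), 0.0))   # True, up to rounding error
```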
The second important scalar associated with a square matrix is its determinant. It is simplest to define the determinant of an $n \times n$ matrix inductively. The determinant of a $1 \times 1$ matrix $A = (a_{11})$, written $\det(A)$, is defined as follows: $\det(A) = a_{11}$. The determinant of a $2 \times 2$ matrix is defined as follows:
$$\det \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21}.$$
For any $n > 2$, the determinant of an $n \times n$ matrix $A$ is defined inductively as follows. The minor $M_{ij}$ of $A$ is defined as the $(n-1) \times (n-1)$ matrix that is obtained by removing the $i^{\text{th}}$ row and $j^{\text{th}}$ column of $A$. The determinant of $A$ is defined using the determinants of the minors as follows: for any fixed row index $i$,
$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(M_{ij}).$$
Note that $i$ in the equation above can be chosen to be any row index. The definition of the determinant is best illustrated with a few examples.
Example
Consider a $3 \times 3$ matrix $A$ with entries $a_{ij}$. The determinant of $A$ is computed as follows:
$$\det(A) = a_{11}\left(a_{22}a_{33} - a_{23}a_{32}\right) - a_{12}\left(a_{21}a_{33} - a_{23}a_{31}\right) + a_{13}\left(a_{21}a_{32} - a_{22}a_{31}\right).$$
Note that the determinant is expanded using the first row here. It is left as a simple exercise to verify that the value of the determinant is the same irrespective of which row is chosen.
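The inductive definition of the determinant can be written almost verbatim as a recursive function. The following Python sketch (expansion along the first row; the sample matrix is an arbitrary choice, not the one from the notes) is meant only to illustrate the definition, since cofactor expansion is far too slow for large matrices.

```python
def det(A):
    """Determinant by cofactor expansion along the first row (illustrative only)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        # minor: remove row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

A = [[2.0, 1.0, 3.0],
     [0.0, 4.0, 1.0],
     [1.0, 2.0, 5.0]]
print(det(A))   # 25.0 for this particular matrix
```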
Inverse of a matrix
A square matrix $A$ of order $n$ is said to be invertible if there exists another matrix $B$ such that $AB = BA = I_n$. In this case, it is conventional to write $B$ as $A^{-1}$.
An explicit formula for the inverse of an invertible square matrix $A$ can be provided. Define first the cofactor matrix of $A$, written $\operatorname{cof}(A)$, as follows:
$$[\operatorname{cof}(A)]_{ij} = (-1)^{i+j} \det(M_{ij}),$$
where $M_{ij}$ is the minor of $A$. The transpose of the cofactor matrix of $A$ is called the adjoint of $A$, written $\operatorname{adj}(A)$:
$$\operatorname{adj}(A) = \left(\operatorname{cof}(A)\right)^T.$$
With these definitions in place, the inverse of an invertible square matrix $A$ can be written as
$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A).$$
Note that this also informs us that the matrix $A$ is invertible if and only if its determinant is non-zero. This fact is frequently used in applications.
Example
Suppose that $A, B$ are invertible $n \times n$ matrices. Then $(AB)^{-1} = B^{-1} A^{-1}$. To show this, note that
$$(AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = A I A^{-1} = AA^{-1} = I.$$
Similarly, it can be shown that $\left(B^{-1}A^{-1}\right)(AB) = I$, thereby proving the claim.
Example
As an elementary illustration of the computation of the inverse, let us now compute the inverse of the matrix , where The cofactor of is easily computed as The determinant of is computed easily as . Putting all this together, it follows from a simple computation that It can be checked by means of a direct substitution that this is indeed the inverse of .
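The cofactor formula for the inverse can also be checked numerically. The sketch below (NumPy assumed; the 3 x 3 matrix is an arbitrary stand-in for the one used in the notes) builds the cofactor matrix entry by entry, forms the adjoint, and compares the result of the formula with the inverse returned by NumPy.

```python
import numpy as np

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 1.0],
              [1.0, 2.0, 5.0]])   # arbitrary invertible matrix for illustration

n = A.shape[0]
cof = np.zeros_like(A)
for i in range(n):
    for j in range(n):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)   # remove row i, column j
        cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

adj = cof.T                                  # adjoint = transpose of the cofactor matrix
A_inv = adj / np.linalg.det(A)               # inverse via the cofactor formula

print(np.allclose(A_inv, np.linalg.inv(A)))  # True
print(np.allclose(A @ A_inv, np.eye(n)))     # True: A A^{-1} = I
```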
Linear systems of equations
An important application where matrices naturally find use is the solution of linear systems of equations. To understand what this means, suppose that we are given constants $a_{ij}$ and $b_i$, where $1 \leq i, j \leq n$. We are interested in finding real numbers $x_1, x_2, \ldots, x_n$ that satisfy the following set of equations:
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1, \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2, \\ &\ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n. \end{aligned}$$
These equations can be succinctly written as follows: for every $1 \leq i \leq n$,
$$\sum_{j=1}^{n} a_{ij} x_j = b_i.$$
These equations can be written even more succinctly in the matrix form
$$Ax = b,$$
where $[A]_{ij} = a_{ij}$, $[x]_i = x_i$, and $[b]_i = b_i$. Thus, the system of linear equations has been reduced to an equation involving matrices.
The system of equations $Ax = b$ does not always possess a unique solution. In the special case, however, when $\det(A) \neq 0$, the matrix $A$ is invertible, and a unique solution for the system of equations can be found as
$$x = A^{-1} b,$$
as can be checked with direct substitution.
Remark
It is to be noted that when $A$ is invertible, it is rare in practice to compute the solution of the set of linear equations using the relation $x = A^{-1}b$. This is due to the fact that calculating the inverse is computationally expensive for large matrices.
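In numerical libraries this advice translates into using a dedicated solver rather than forming the inverse explicitly. For example, with NumPy (an assumed dependency, and an arbitrary matrix chosen for illustration), one would write:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # arbitrary invertible matrix
b = np.array([9.0, 8.0])

x_solve = np.linalg.solve(A, b)     # preferred: solves Ax = b directly
x_inv   = np.linalg.inv(A) @ b      # works, but needlessly expensive for large A

print(np.allclose(x_solve, x_inv))  # True: both give the same solution
```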
Example
Perhaps the simplest method to solve the system of equations is the Gaussian elimination algorithm. The idea behind this algorithm is to systematically reduce the system of equations to the successive solution of equations with just one unknown variable. Rather than explaining the general approach, it is instructive to look at a specific example.
Suppose that we are interested in solving the following system of equations:
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 &= b_1, \\ a_{21} x_1 + a_{22} x_2 &= b_2. \end{aligned}$$
It is assumed that the matrix $A$ whose $(i,j)$ entry is $a_{ij}$ is invertible, and, without loss of generality, that $a_{21} \neq 0$. Begin by multiplying the second equation by $a_{11}/a_{21}$. This yields the following modified set of equations:
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 &= b_1, \\ a_{11} x_1 + \frac{a_{11} a_{22}}{a_{21}} x_2 &= \frac{a_{11}}{a_{21}} b_2. \end{aligned}$$
Retaining the first equation as is and subtracting the first from the second equation, we get
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 &= b_1, \\ \left( \frac{a_{11} a_{22}}{a_{21}} - a_{12} \right) x_2 &= \frac{a_{11}}{a_{21}} b_2 - b_1. \end{aligned}$$
Solving the last equation for $x_2$, we immediately see that
$$x_2 = \frac{a_{11} b_2 - a_{21} b_1}{a_{11} a_{22} - a_{12} a_{21}}.$$
Substituting this expression for $x_2$ in the first equation and solving for $x_1$, we get, after some algebraic manipulation, that
$$x_1 = \frac{a_{22} b_1 - a_{12} b_2}{a_{11} a_{22} - a_{12} a_{21}}.$$
Note that this is exactly the solution obtained by solving the system of equations using the formula $x = A^{-1} b$.
The foregoing calculations can be visualized as follows. Begin by collecting together the elements of the matrices $A$ and $b$ in the augmented matrix
$$\left( \begin{array}{cc|c} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \end{array} \right).$$
The sequence of transformations carried out earlier can be summarized as follows:
$$\left( \begin{array}{cc|c} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \end{array} \right) \longrightarrow \left( \begin{array}{cc|c} a_{11} & a_{12} & b_1 \\ 0 & \dfrac{a_{11}a_{22}}{a_{21}} - a_{12} & \dfrac{a_{11}}{a_{21}} b_2 - b_1 \end{array} \right).$$
The matrix on the right hand side of the foregoing transformation is said to be in upper triangular form, and the corresponding system can be solved by back-substitution from the last equation upwards. The solution of a general linear system of equations is computed using an analogous and straightforward extension of this approach.
Example
As a concrete illustration of the Gaussian elimination algorithm, consider the solution of the following set of linear equations: The Gaussian elimination procedure can be carried out in a sequence of two steps. In the first step, appropriate multiples of the first equation are used to eliminate $x_1$ from the second and third equations. In the second step, an appropriate multiple of the second equation is used to eliminate $x_2$ from the third equation. These steps are summarized below: These equations are solved by back-substitution as follows: We thus obtain the solution of the linear system of equations as , , and . The fact that this is indeed a solution of the linear system of equations presented above can be verified by direct substitution.
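For completeness, here is a compact Python sketch of the Gaussian elimination procedure just described: forward elimination to upper triangular form followed by back-substitution. It assumes NumPy and, for simplicity, that no pivoting is required (the leading entries encountered are non-zero); the system shown is an arbitrary example, not the one from the notes.

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination without pivoting (illustrative sketch)."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    # Forward elimination: reduce A to upper triangular form.
    for k in range(n - 1):
        for i in range(k + 1, n):
            factor = A[i, k] / A[k, k]          # multiple of row k to subtract from row i
            A[i, k:] -= factor * A[k, k:]
            b[i] -= factor * b[k]
    # Back-substitution from the last equation upwards.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([1.0, 2.0, 5.0])
x = gauss_solve(A, b)
print(np.allclose(A @ x, b))   # True
```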
Eigenvalues and eigenvectors
To wrap up the discussion of elementary matrix algebra, let us consider the eigenvalue problem. Suppose that we are given a square matrix $A$ of order $n$. If there exists a vector $x \in \mathbb{R}^n$ and a real number $\lambda$ such that
$$Ax = \lambda x,$$
then $\lambda$ is called an eigenvalue of $A$ corresponding to the eigenvector $x$.
Remark
Note that the zero vector trivially satisfies this equation for any choice of $\lambda$. We will exclude this trivial solution and assume henceforth that every eigenvector is non-zero.
Note that the equation $Ax = \lambda x$ can be written as the equation
$$(A - \lambda I)x = 0.$$
For this equation to have a non-trivial solution $x \neq 0$, it follows at once that
$$\det(A - \lambda I) = 0.$$
This is a polynomial equation in $\lambda$ of degree $n$, called the characteristic equation of $A$, and has $n$ solutions in general. These solutions correspond to the eigenvalues of the matrix $A$.
Example
Consider the $3 \times 3$ matrix $A$ given as follows: Let us compute the eigenvalues of $A$ by computing its characteristic polynomial. Note first that the matrix $A - \lambda I$ has the following form: Computing the determinant of this matrix and setting it to zero, the characteristic equation is obtained as This shows at once that the eigenvalues of the matrix are , , and .
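In practice, eigenvalues are rarely found by expanding the characteristic polynomial by hand. The following sketch (NumPy assumed; the matrix is an arbitrary example, not the one from the notes) compares the roots of the characteristic polynomial with the eigenvalues reported by np.linalg.eigvals.

```python
import numpy as np

A = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 5.0]])     # arbitrary matrix used only for illustration

# Coefficients of the characteristic polynomial det(lambda I - A), highest degree first.
coeffs = np.poly(A)
roots = np.roots(coeffs)            # roots of the characteristic polynomial
eigvals = np.linalg.eigvals(A)      # eigenvalues computed directly

print(np.sort(roots.real), np.sort(eigvals.real))   # both give 2, 3, 5
```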
The characteristic equation, when expanded fully, has the following structure:
$$\lambda^n - I_1 \lambda^{n-1} + I_2 \lambda^{n-2} - \cdots + (-1)^n I_n = 0.$$
The constants $I_1, I_2, \ldots, I_n$ are called the invariants of $A$, and can be related to the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of $A$. Notably, the invariants $I_1$ and $I_n$ are computed as
$$I_1 = \operatorname{tr}(A) = \lambda_1 + \lambda_2 + \cdots + \lambda_n, \qquad I_n = \det(A) = \lambda_1 \lambda_2 \cdots \lambda_n.$$
The other invariants of $A$ can be similarly related to the eigenvalues of $A$, but they do not have as simple an interpretation as $I_1$ and $I_n$.
Example
For the matrix considered in the previous example, the characteristic equation takes the form It can be checked that Furthermore, for matrices of order three, it is true that
$$I_2 = \lambda_1 \lambda_2 + \lambda_2 \lambda_3 + \lambda_3 \lambda_1 = \tfrac{1}{2}\left[ \left(\operatorname{tr} A\right)^2 - \operatorname{tr}\!\left(A^2\right) \right],$$
as can be checked with a simple calculation.
The Cayley-Hamilton theorem states that the matrix $A$ satisfies the characteristic equation:
$$A^n - I_1 A^{n-1} + I_2 A^{n-2} - \cdots + (-1)^n I_n I = 0.$$
This equation is often used in practice to simplify a variety of calculations. The following example provides an elementary illustration of the Cayley-Hamilton theorem.
Example
Suppose that $A$ is an invertible $n \times n$ matrix. Then, its inverse can be computed using the Cayley-Hamilton theorem as follows. Multiplying the characteristic equation satisfied by $A$ with $A^{-1}$ yields
$$A^{n-1} - I_1 A^{n-2} + \cdots + (-1)^{n-1} I_{n-1} I + (-1)^n I_n A^{-1} = 0,$$
whence
$$A^{-1} = \frac{(-1)^{n-1}}{I_n} \left[ A^{n-1} - I_1 A^{n-2} + \cdots + (-1)^{n-1} I_{n-1} I \right].$$
Notice how the fact that $I_n = \det(A) \neq 0$ is used in the last step of this calculation. The Cayley-Hamilton theorem thus provides a convenient expression for the inverse of $A$; compare this with the general expression for the inverse provided earlier.
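A quick numerical check of the Cayley-Hamilton theorem, and of the inverse formula derived from it, is given below for a 3 x 3 matrix. NumPy is assumed; the matrix is an arbitrary illustration, and the names I1, I2, I3 follow the invariant notation used above.

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])      # arbitrary invertible 3 x 3 matrix

I1 = np.trace(A)                                  # first invariant
I2 = 0.5 * (np.trace(A) ** 2 - np.trace(A @ A))   # second invariant
I3 = np.linalg.det(A)                             # third invariant
Id = np.eye(3)

# Cayley-Hamilton for 3 x 3 matrices: A^3 - I1 A^2 + I2 A - I3 I = 0.
CH = A @ A @ A - I1 * (A @ A) + I2 * A - I3 * Id
print(np.allclose(CH, np.zeros((3, 3))))          # True

# Inverse from Cayley-Hamilton: A^{-1} = (A^2 - I1 A + I2 I) / I3.
A_inv = (A @ A - I1 * A + I2 * Id) / I3
print(np.allclose(A_inv, np.linalg.inv(A)))       # True
```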
The eigenvectors of $A$ can be obtained by substituting the eigenvalues successively in the equation $(A - \lambda I)x = 0$. Notice that if $x$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$, then $\alpha x$ is also an eigenvector of $A$ corresponding to the same eigenvalue for any $\alpha \neq 0$. This fact is often used to single out a particular value of $\alpha$, and hence a particular eigenvector. This is best illustrated with an example.
Example
Consider again the matrix $A$ from the earlier example: The eigenvalues of $A$ were computed earlier as . Let us now compute the corresponding eigenvectors.
Consider first the eigenvalue $\lambda_1$. Substituting this in the equation $(A - \lambda I)x = 0$, we get In the equations above, we have used the notation $x = (x_1, x_2, x_3)^T$. Notice that this matrix equation does not have a unique solution since $\det(A - \lambda_1 I) = 0$. To compute all possible solutions, let us compute the components $x_1$ and $x_2$ in terms of $x_3$. Begin by rewriting the first two equations as These equations can be readily solved to yield Thus, every vector of the form , where , is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_1$. Different choices for the free parameter can be made depending on the context. For instance, choosing , we obtain the eigenvector . Requiring that the eigenvector has unit length, on the other hand, yields the eigenvector .
Remark
The dot product of two vectors $x, y \in \mathbb{R}^n$, written $x \cdot y$, is defined as $x \cdot y = \sum_{i=1}^{n} x_i y_i$. Note that it follows from the definition that $x \cdot y = y \cdot x$. The length of a vector $x$ is defined as $\|x\| = \sqrt{x \cdot x}$.
The other eigenvectors of can be computed similarly. It is left as an exercise to check that and are the eigenvectors of corresponding to the eigenvalues and , respectively.
Suppose that $x^{(1)}, x^{(2)}, \ldots, x^{(n)}$ are the eigenvectors of $A$ corresponding to the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively. Then the set of equations $A x^{(i)} = \lambda_i x^{(i)}$, where $1 \leq i \leq n$, can be written compactly as
$$AP = P\Lambda,$$
where
$$P = \begin{pmatrix} x^{(1)} & x^{(2)} & \cdots & x^{(n)} \end{pmatrix}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n).$$
Notice that the first column of $P$ is the first eigenvector of $A$, the second column of $P$ is the second eigenvector of $A$, and so on. It can be shown that if the eigenvectors are linearly independent (a proper definition of this term is provided in these notes in the general context of finite dimensional vector spaces), then the matrix $P$ is invertible. In this case,
$$A = P \Lambda P^{-1}.$$
This equation is called the eigendecomposition of $A$, and plays an important role in many applications.
Example
Consider again the matrix $A$ from the previous examples: The matrices $P$ and $\Lambda$ can be constructed as follows: Note that the matrix $P$ is invertible in this case. It is left as a simple exercise to check that $A = P \Lambda P^{-1}$, thereby verifying the eigendecomposition of $A$.
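Such a verification is easily carried out numerically as well. The sketch below (NumPy assumed; the matrix is an arbitrary example with distinct eigenvalues, standing in for the one used in the notes) assembles P and Lambda from the computed eigenpairs and checks the eigendecomposition.

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])          # arbitrary matrix with distinct eigenvalues

eigvals, P = np.linalg.eig(A)            # columns of P are the eigenvectors of A
Lam = np.diag(eigvals)                   # diagonal matrix of eigenvalues

print(np.allclose(A, P @ Lam @ np.linalg.inv(P)))   # True: A = P Lambda P^{-1}
```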
TO DO
Degenerate cases