Linear Algebra Glossary

A-Orthogonal Vectors

Two vectors u and v are said to be A-orthogonal if

( u, A * v ) = 0.

Here A should be a positive definite symmetric matrix, which in turn guarantees that the expression ( u, A * v ) may be regarded as an inner product of the vectors u and v, with the usual properties.

This concept is useful in the analysis of the conjugate gradient method.
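
Here is a small Python (numpy) check of this definition; the matrix and the vectors are invented for illustration:

        import numpy as np

        A = np.array([[ 2.0, -1.0],
                      [-1.0,  2.0]])   # symmetric positive definite

        u = np.array([1.0, 0.0])
        v = np.array([1.0, 2.0])

        # ( u, A * v ) = u' * A * v; u and v are A-orthogonal when this is zero.
        print(u @ A @ v)               # prints 0.0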

Adjacency Matrix

An adjacency matrix of an (undirected) graph is a matrix whose order is the number of nodes, and whose entries record which nodes are connected to each other by a link or edge of the graph.

If two nodes I and J are connected by an edge, then A(I,J) = A(J,I) = 1. All other entries of the matrix are 0. Thus, an adjacency matrix is a zero-one matrix. The usual convention is that a node is not connected to itself, and hence the diagonal of the matrix is zero.

The product A^2 = A * A is a matrix whose (I,J) entry records the number of paths of length 2 between nodes I and J. If it is possible to reach one node from another, it must be possible in a path of no more than n-1 links. Hence, the reachability matrix, which records whether it is possible to get from node I to node J in one or more steps, can be determined by taking the logical sum of the matrices I, A, A^2, ..., A^(n-1).

(Loops:) If edges are allowed which begin and end at the same node, (sometimes called loops) then the adjacency matrix can have some 1's on the diagonal.

(Directed graphs:) If edges are assigned a direction, then it is possible that there is an edge from node I to node J, but not in the reverse direction. In this case of a "directed graph", an adjacency matrix can be defined as well. The only difference is that the adjacency matrix now need not be symmetric. The reachability matrix can still be defined in the same way.

(Multigraphs:) If multiple edges are allowed between two nodes, (which might be distinguished by color in a drawing), then the adjacency matrix can record this information by storing in Ai,j the number of edges between the two nodes. The reachability matrix can still be defined in the same way, and still counts the number of different routes correctly.
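
Here is a small Python (numpy) sketch of these ideas, for an invented four node graph:

        import numpy as np

        # Undirected graph on nodes 0..3 with edges 0-1, 1-2, 2-3.
        A = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])

        A2 = A @ A        # entry (i,j) counts the paths of length 2 from i to j
        print(A2)

        # Reachability: logical sum of I, A, A^2, ..., A^(n-1).
        n = A.shape[0]
        reach = np.eye(n, dtype=bool)
        power = np.eye(n, dtype=int)
        for _ in range(n - 1):
            power = power @ A
            reach = reach | (power > 0)
        print(reach.astype(int))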

Adjoint Matrix

The adjoint matrix of a square matrix A has the property that:

A * adjoint ( A ) = adjoint ( A ) * A = det(A) * I.

Note that, if A is not invertible, then det(A)=0, and the defining property only says that A * adjoint ( A ) is the zero matrix; the adjoint itself, defined below in terms of cofactors, need not be zero in this case.

The adjoint of a matrix A is "almost" its inverse. If we know that A is invertible, then the inverse of A, denoted inverse ( A ) or A^(-1), can be written explicitly as:

A^(-1) = ( 1 / det(A) ) * adjoint ( A ).

This formula is only of use if we can figure out how to evaluate the things on the right hand side. The first factor is not so bad, since we do know a formula for the determinant of a matrix.

But there is also a way of evaluating the second factor. The adjoint matrix is defined in terms of the cofactor matrix of A:

adjoint ( A ) = ( cofactor ( A ) )'.
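
Here is a small Python (numpy) sketch that builds the adjoint from the cofactor definition and checks the identities above; the example matrix is invented:

        import numpy as np

        def cofactor(A):
            # cofactor(A)(i,j) = (-1)^(i+j) * det of A with row i and column j deleted
            n = A.shape[0]
            C = np.zeros_like(A, dtype=float)
            for i in range(n):
                for j in range(n):
                    minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                    C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
            return C

        A = np.array([[2.0, 3.0],
                      [1.0, 4.0]])
        adjoint = cofactor(A).T                 # adjoint ( A ) = ( cofactor ( A ) )'
        print(A @ adjoint)                      # det(A) * I = 5 * I
        print(np.linalg.inv(A) - adjoint / np.linalg.det(A))    # essentially zero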

Alternating Sign Matrix

An alternating sign matrix is an integer matrix with the properties that every entry is 0, 1 or -1, that the nonzero entries in each row and in each column alternate in sign, and that the entries in each row and in each column sum to 1.

Example:

         0   1   0   0   0
         1  -1   0   1   0
         0   1   0  -1   1
         0   0   0   1   0
         0   0   1   0   0
      

Obviously, alternating sign matrices include the identity matrix and any permutation matrix. From the definitions, you should see that the first row must contain a single entry which is 1, with the other values being 0. The nonzeroes in the second row can be a single 1, or the values, 1, -1, 1, in that order, with some intervening zeroes possible. In the third row, the value -1 may occur up to as many times as there were 1's in preceding rows, which means the most interesting row could be 1, -1, 1, -1, 1, -1, 1. Thus the number of possible nonzeroes grows until the central row of the matrix is reached. Since the same restrictions apply from the bottom reading up, the number of possible nonzeroes must now decrease. Similar reasoning controls the nonzero population of the columns.

If we let A(n) denote the number of distinct alternating sign matrices of order n, then it has only recently been proved that

A(n) = Product ( 0 <= i <= n-1 ) (3*i+1)! / (n+i)!
giving the sequence 1, 2, 7, 42, 429, 7436, 218348, 10850216, ...
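
Here is a short Python check of the formula; it reproduces the sequence above:

        from math import factorial

        def asm_count(n):
            num, den = 1, 1
            for i in range(n):
                num *= factorial(3 * i + 1)
                den *= factorial(n + i)
            return num // den

        print([asm_count(n) for n in range(1, 9)])
        # [1, 2, 7, 42, 429, 7436, 218348, 10850216]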

References:

  1. David Robbins,
    The Story of 1, 2, 7, 42, 429, 7436, ...,
    Mathematical Intelligencer,
    Volume 13, Number 2, pages 12-19.
  2. David Bressoud,
    Proofs and Confirmations: The Story of the Alternating Sign Matrix Conjecture,
    Cambridge University Press, 1999.

Anticirculant Matrix

An anticirculant matrix is a matrix whose first row of values is repeated in each successive row, shifted one position to the left, with the first value "wrapping around" to the end.

Here is an example of an anticirculant matrix:

        1 2 3 4
        2 3 4 1
        3 4 1 2
        4 1 2 3
      
and here is an example of a rectangular anticirculant matrix:
        1 2 3 4 5
        2 3 4 5 1
        3 4 5 1 2
      

Simple facts about an anticirculant matrix A:

Antisymmetric Matrix

A square matrix A is antisymmetric if it is equal to the negative of its transpose:

A = - A'.

Every matrix A can be decomposed into the sum of an antisymmetric and a symmetric matrix:

A = B + C = (1/2) * ( A - A' ) + (1/2) * ( A + A' )

where B = (1/2) * ( A - A' ) is the antisymmetric part and C = (1/2) * ( A + A' ) is the symmetric part.
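
Here is a small numpy sketch of this decomposition, on an invented matrix:

        import numpy as np

        A = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0],
                      [7.0, 8.0, 10.0]])

        B = 0.5 * (A - A.T)     # antisymmetric part: B' = -B
        C = 0.5 * (A + A.T)     # symmetric part:     C' =  C
        print(np.allclose(A, B + C))                        # True
        print(np.allclose(B.T, -B), np.allclose(C.T, C))    # True True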

Simple facts about an antisymmetric matrix A:

An antisymmetric matrix is also called skew symmetric.

In complex arithmetic, the corresponding object is a skew Hermitian matrix.

Arnoldi Iteration

The Arnoldi iteration is an iterative method that seeks to approximate some of the eigenvalues of a very large matrix.

If the matrix is symmetric or Hermitian, then the Arnoldi iteration is usually reformulated into the more efficient Lanczos iteration.

We start by considering the unitary similarity transform that converts the m by m matrix A into upper Hessenberg form:


        A = Q H Q*
      
Now we let Q(n) indicate the m by n matrix formed from the first n columns of Q; similarly, H(n) is the (n+1) by n upper left section of H. Then we write

        A * Q(n) = Q(n+1) * H(n)
      
In particular, we can look at the subsystem involving the n-th column of Q(n):

        A * q(n) = h(1,n) * q(1) + ... 
          + h(n,n) * q(n)
          + h(n+1,n) * q(n+1)
      

This can be interpreted as a recursive relationship for q(n+1) in terms of the previous vectors. The Arnoldi iteration allows us to build up the matrices Q(n) and H(n). For reasons not completely understood, even for n much smaller than m, the eigenvalues of H(n) will tend to some of the eigenvalues of A, and typically the extreme ones.

The new vector q(n+1) is a Krylov vector; actually, because of orthogonalization, it represents the new component introduced into the Krylov subspace by the product A*q(n).

Thus, a typical procedure is to compute H(n) and apply a standard eigenvalue procedure for Hessenberg matrices to extract its eigenvalues.

An efficient form for the Arnoldi iteration is:

        Let b be an arbitrary initial vector;
        Set q(1) = b / ||b||;
        for n = 1, ...
          v = A * q(n)
          for j = 1 to n
            h(j,n) = conjugate ( q(j) ) * v
            v = v - q(j) * h(j,n)
          end for
          h(n+1,n) = ||v||
          q(n+1) = v / h(n+1,n)
        end for
      
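
Here is a small numpy transcription of the iteration above; the 50 by 50 test matrix and the number of steps are invented for illustration:

        import numpy as np

        def arnoldi(A, b, nsteps):
            m = A.shape[0]
            Q = np.zeros((m, nsteps + 1), dtype=complex)
            H = np.zeros((nsteps + 1, nsteps), dtype=complex)
            Q[:, 0] = b / np.linalg.norm(b)
            for n in range(nsteps):
                v = A @ Q[:, n]
                for j in range(n + 1):
                    H[j, n] = np.vdot(Q[:, j], v)     # conjugate ( q(j) ) * v
                    v = v - H[j, n] * Q[:, j]
                H[n + 1, n] = np.linalg.norm(v)
                Q[:, n + 1] = v / H[n + 1, n]
            return Q, H

        rng = np.random.default_rng(0)
        A = rng.standard_normal((50, 50))
        Q, H = arnoldi(A, rng.standard_normal(50), 20)

        # Eigenvalues of the square part of H approximate extreme eigenvalues of A.
        ritz = np.linalg.eigvals(H[:20, :20])
        print(ritz[np.argsort(-np.abs(ritz))][:3])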

Band Matrix

A band matrix is a matrix whose entries are all zero except for the diagonal and a few of the immediately adjacent diagonals.

If the band matrix is large enough, then significant efficiencies can be achieved in storage and in matrix operations. Because so many elements are zero, the matrix can be stored more compactly using band matrix storage, and many algorithms, including matrix multiplication and Gauss elimination, can be sped up by skipping the zero entries.

Special cases of band matrices include those which are also symmetric or positive definite.

Band matrices with very tight bandwidths are of interest, and include matrices which are diagonal, bidiagonal, tridiagonal, and in some cases pentadiagonal.

Here is an example of a band matrix:

        11 12  0  0  0  0
        21 22 23  0  0  0
        31 32 33 34  0  0
         0 42 43 44 45  0
         0  0 53 54 55 56
         0  0  0 64 65 66
      
This matrix has an upper bandwidth of 1, a lower bandwidth of 2, and an overall bandwidth of 4.

In some cases, a band matrix can be LU-factored without pivoting. This is always true if the matrix is positive definite. If a banded matrix A with lower and upper bandwidths ml and mu can be LU factored without pivoting, then its factorization can be written as

        A = L * U
      
The usual P factor is missing because there is no pivoting, but more is true. The factor L, which is unit lower triangular, is in fact also a band matrix, with lower bandwidth ml, and U, the upper triangular factor, is also a band matrix with upper bandwidth mu.

If the matrix A is positive definite symmetric banded, with a half-bandwidth of m, then the banded LU factorization becomes:

        A = L * U
      
with L and U having m as their lower and upper bandwidths respectively. Moreover, if we "factor out" the non-unit diagonal of U by writing
        U = D * V
      
where V is unit upper triangular, we may write
        A = L * U = L * D * V = L * D^(1/2) * D^(1/2) * V = R' * R 
      
where D^(1/2) is the obvious "square root" of D (which must have positive diagonal entries because A is positive definite, remember!), and R = D^(1/2) * V is an upper triangular banded matrix with upper bandwidth m. (Since A is symmetric, V = L', so R' = L * D^(1/2).) Another way of looking at this is to say that the Cholesky factor of a banded positive definite symmetric matrix is itself banded with the same (half) bandwidth as the original matrix.
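
Here is a small numpy check of that last statement, using an invented tridiagonal positive definite matrix:

        import numpy as np

        n = 6
        A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # SPD, half-bandwidth 1

        L = np.linalg.cholesky(A)                  # A = L * L'
        print(np.allclose(A, L @ L.T))             # True
        print(np.allclose(np.tril(L, -2), 0.0))    # True: L also has half-bandwidth 1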

LAPACK and LINPACK include special routines for a variety of band matrices. These routines can compute the LU factorization, the determinant, inverse, or solution of a linear system.

LAPACK and EISPACK have routines for computing the eigenvalues of a symmetric banded matrix.

Band Matrix Storage

Band matrix storage is a matrix storage format suitable for efficiently storing the nonzero entries of a band matrix.

Most band storage schemes are column oriented. Each nonzero entry remains in its original column, but moves to the row of the compressed array that corresponds to its diagonal, so that each diagonal of the matrix becomes a row of the storage array. If the original matrix looked like this:

        11 12  0  0  0  0
        21 22 23  0  0  0
        31 32 33 34  0  0
         0 42 43 44 45  0
         0  0 53 54 55 56
         0  0  0 64 65 66
      
the matrix would be saved in column band matrix storage as:
 
         0 12 23 34 45 56
        11 22 33 44 55 66
        21 32 43 54 65  0
        31 42 53 64  0  0
      
Note that the zeroes in the above array are there just as padding. They don't correspond to any entries of the original array, and are simply necessary to make the array rectangular.

If the matrix is to be handled by a Gauss elimination routine that uses pivoting, then there is a possibility of fill in; that is, nonzero entries may need to be stored in places where zeroes had been. Band matrix storage can still be used, but we need to include in the compressed matrix some extra entries representing the diagonals along which the fill in entries may occur. It turns out that the number of extra diagonals required is simply the number of nonzero subdiagonals in the original matrix. For our example, this would mean the matrix would be stored as:

 
         0  0  0  0  0  0
         0  0  0  0  0  0
         0 12 23 34 45 56
        11 22 33 44 55 66
        21 32 43 54 65  0
        31 42 53 64  0  0
      
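
Here is a small Python (numpy) sketch of this storage scheme, applied to the example matrix above; the helper routine is an illustration, not a library interface:

        import numpy as np

        def to_band_storage(A, ml, mu):
            n = A.shape[1]
            AB = np.zeros((ml + mu + 1, n))
            for j in range(n):
                for i in range(max(0, j - mu), min(A.shape[0], j + ml + 1)):
                    AB[mu + i - j, j] = A[i, j]    # diagonal i-j picks the row
            return AB

        A = np.array([[11, 12,  0,  0,  0,  0],
                      [21, 22, 23,  0,  0,  0],
                      [31, 32, 33, 34,  0,  0],
                      [ 0, 42, 43, 44, 45,  0],
                      [ 0,  0, 53, 54, 55, 56],
                      [ 0,  0,  0, 64, 65, 66]], dtype=float)
        print(to_band_storage(A, ml=2, mu=1))      # matches the 4 row array above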

Bandwidth

The bandwidth of a band matrix is, roughly speaking, the number of diagonals that contain nonzero entries.

More precisely, define ML, the lower bandwidth of a matrix A, to be the maximum value of ( I - J ), and MU, the upper bandwidth, to be the maximum value of ( J - I ), over all nonzero matrix entries A(I,J). Then the bandwidth M is defined by:

M = ML + 1 + MU.

This definition always treats the (main) diagonal as nonzero, and is not misled by a matrix which has only two nonzero diagonals, which are actually widely separated. All the territory between the diagonals must be included when measuring bandwidth.
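
Here is a direct numpy transcription of this definition, applied to an invented example:

        import numpy as np

        def bandwidths(A):
            i, j = np.nonzero(A)
            ml = max(0, int(np.max(i - j)))    # lower bandwidth: max of I - J
            mu = max(0, int(np.max(j - i)))    # upper bandwidth: max of J - I
            return ml, mu, ml + 1 + mu         # ( ML, MU, M )

        A = np.array([[11, 12,  0,  0],
                      [21, 22, 23,  0],
                      [31, 32, 33, 34],
                      [ 0, 42, 43, 44]])
        print(bandwidths(A))                   # (2, 1, 4)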

Basis

A basis for a linear space X of dimension N is a set of N vectors, {v(i) | 1 <= i <= N } from which all the elements of X can be constructed by linear combinations.

Naturally, we require that each of the vectors v(i) be an element of the space X. Moreover, it is not enough that these vectors span the space; we also require that they be linearly independent, that is, there should be no redundant vectors in the set.

The columns of the identity matrix form a basis for the linear space of vectors of dimension N. A square matrix of order N is not defective exactly when its eigenvectors form a basis for the linear space of vectors of dimension N.

Given a particular basis, the representation of a vector x is the unique set of coefficients c(i) so that

x = Sum ( 1 <= I <= N ) c(i) * v(i)
The coefficients must be unique, otherwise you can prove that the basis is not linearly independent!

If the basis vectors are pairwise orthogonal, then the basis is called an orthogonal basis. If the basis vectors have unit length in the Euclidean norm, the basis is a normal basis. If both properties apply, it is an orthonormal basis. The columns of an orthogonal matrix are an orthonormal basis for the linear space of vectors of dimension N.
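
Finding the representation of a vector in a given basis amounts to solving a linear system whose columns are the basis vectors; here is a small numpy sketch with an invented basis:

        import numpy as np

        # Columns of V are the basis vectors v(1), v(2), v(3).
        V = np.array([[1.0, 0.0, 0.0],
                      [1.0, 1.0, 0.0],
                      [0.0, 1.0, 1.0]])
        x = np.array([2.0, 3.0, 4.0])

        c = np.linalg.solve(V, x)          # coefficients with x = sum c(i) * v(i)
        print(np.allclose(V @ c, x))       # True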

Bidiagonal Matrix

A bidiagonal matrix has only two nonzero diagonals. The matrix is called upper bidiagonal if these are the main diagonal and the principal superdiagonal. The matrix is called lower bidiagonal if these are the main diagonal and the principal subdiagonal.

A simple example of an upper bidiagonal matrix is:

        1 2 0 0 0
        0 3 4 0 0
        0 0 5 6 0
        0 0 0 7 8
        0 0 0 0 9
      

The Jordan Canonical Form is an example of an upper bidiagonal matrix.

A bidiagonal matrix is automatically a:

with all the rights and privileges appertaining thereunto.

If a tridiagonal matrix can be LU-factored without pivoting, then the L factor is lower bidiagonal, and the U factor is upper bidiagonal.

Basic Linear Algebra Subprograms (BLAS)

The BLAS or Basic Linear Algebra Subprograms, are a set of routines offering vector and matrix utilities. They are extensively used as part of LINPACK and LAPACK, to simplify algorithms, and to make them run more quickly.

The Level 1 BLAS provide basic vector operations such as the dot product, vector norm, and scaling. Level 2 BLAS provide operations involving a matrix and a vector, and Level 3 BLAS provide matrix-matrix operations.

There are also sets of sparse BLAS and parallel BLAS available.
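
For orientation, the three levels correspond to the following kinds of operations, shown here with numpy rather than the Fortran BLAS interfaces:

        import numpy as np

        x = np.array([1.0, 2.0, 3.0])
        y = np.array([4.0, 5.0, 6.0])
        A = np.eye(3)
        B = np.ones((3, 3))

        print(x @ y, np.linalg.norm(x), 2.0 * x + y)   # Level 1: dot, norm, "axpy"
        print(A @ x)                                   # Level 2: matrix-vector product
        print(A @ B)                                   # Level 3: matrix-matrix product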

Here are the Level 1 BLAS routines for single precision real vectors:

Block Matrix

A block matrix is a matrix which is described as being built up of smaller matrices.

For example, a tridiagonal block matrix might look like this:

        2 4 | 3 9 | 0 0
        4 6 | 0 3 | 0 0
        ---------------
        1 0 | 2 4 | 3 9
        5 5 | 4 6 | 0 3
        ---------------
        0 0 | 1 0 | 2 4
        0 0 | 5 5 | 4 6
      
but for certain purposes, it might help us to see this matrix as "really" being a tridiagonal matrix, whose elements are themselves little matrices:
        a | b | 0
        ---------
        c | a | b
        ---------
        0 | c | a
      

An algorithm suitable for a tridiagonal matrix can often be extended, in a natural manner, to handle a block tridiagonal matrix. Similar extensions can be made in some cases for other types of block matrices. A block banded matrix can be factored by a variant of banded Gauss elimination, for instance.
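
Here is a small numpy sketch that assembles the block tridiagonal example above from its 2 by 2 blocks:

        import numpy as np

        a = np.array([[2, 4], [4, 6]])
        b = np.array([[3, 9], [0, 3]])
        c = np.array([[1, 0], [5, 5]])
        z = np.zeros((2, 2), dtype=int)

        A = np.block([[a, b, z],
                      [c, a, b],
                      [z, c, a]])
        print(A)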

Border Banded Matrix

A border banded matrix is a 2 by 2 block matrix comprising a (large) leading block which is a square banded matrix, two dense rectangular side strips, and a (small) trailing block which is a square dense matrix.

For example, a "toy" border banded matrix might look like this:

         2 -1  0  0  0  0  0 | 1 2
        -1  2 -1  0  0  0  0 | 2 5
         0 -1  2 -1  0  0  0 | 7 8
         0  0 -1  2 -1  0  0 | 3 3
         0  0  0 -1  2 -1  0 | 4 2
         0  0  0  0 -1  2 -1 | 3 1
         0  0  0  0  0 -1  2 | 7 8
         -------------------------
         3  7  8  3  2  3  1 | 5 2
         1  2  4  7  9  2  4 | 3 6
      
which we can regard as being made up of the blocks:
A11 | A12
---------
A21 | A22
where, as we specified, A11 is a square banded matrix, A22 is a square dense matrix, and A21 and A12 are rectangular strips.

It is desirable to take advantage of the banded structure of A11. We can specify an algorithm for solving a linear system A * x = b that can be written in terms of operations involving the sub-matrices A11, A12, A21 and A22, which will achieve this goal, at the expense of a little extra work. One problem with this technique is that it will fail if certain combinations of the matrices A11 and A22 are singular, which can happen even when A is not singular.

The algorithm for solving A * x = b rewrites the system as:

A11 * X1 + A12 * X2 = B1
A21 * X1 + A22 * X2 = B2
The first equation can be solved for X1 in terms of X2:
X1 = - A11-1 * A12 * X2 + A11-1 * B1
allowing us to rewrite the second equation for X2:
( A22 - A21 * A11-1 * A12 ) X2 = B2 - A21 * A11-1 * B1
which can be solved as:
X2 = ( A22 - A21 * A11-1 * A12 )-1 * ( B2 - A21 * A11-1 * B1 )

The actual algorithm doesn't compute the inverse, of course, but rather factors the matrices A11 and A22 - A21 * A11-1 * A12.
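
Here is a dense numpy sketch of this block elimination, on invented data; plain dense solves stand in for the banded factorization of A11 that a real code would use:

        import numpy as np

        def border_banded_solve(A11, A12, A21, A22, b1, b2):
            S = A22 - A21 @ np.linalg.solve(A11, A12)     # A22 - A21 * A11^(-1) * A12
            x2 = np.linalg.solve(S, b2 - A21 @ np.linalg.solve(A11, b1))
            x1 = np.linalg.solve(A11, b1 - A12 @ x2)
            return x1, x2

        rng = np.random.default_rng(1)
        n, k = 7, 2
        A11 = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # banded block
        A12 = rng.random((n, k))
        A21 = rng.random((k, n))
        A22 = rng.random((k, k)) + 2.0 * np.eye(k)
        b1, b2 = rng.random(n), rng.random(k)

        x1, x2 = border_banded_solve(A11, A12, A21, A22, b1, b2)
        A = np.block([[A11, A12], [A21, A22]])
        x = np.concatenate([x1, x2])
        print(np.allclose(A @ x, np.concatenate([b1, b2])))        # True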

Cartesian Basis Vectors

The Cartesian basis vectors are simply the N columns of the identity matrix, regarded as individual column vectors.

These vectors form the standard basis for the set of vectors R^N. The vector corresponding to the I-th column of the identity matrix is often symbolized by e(i).

Thus, if we are working in a space of dimension N = 4, the basis vectors e(1) through e(4) would be:

        1     0     0    0
        0     1     0    0
        0     0     1    0
        0     0     0    1
      

Facts about the Cartesian basis vectors:

The Cauchy-Schwarz Inequality

The Cauchy-Schwarz Inequality is a relationship between a vector inner product and a vector norm derived from that inner product. In particular, if the norm ||*|| is defined by an inner product (*,*) as follows:

|| x || = sqrt ( ( x, x ) ),
then the Cauchy-Schwarz inequality guarantees that for any vectors x and y it is the case that:
| ( x, y ) | <= || x || * || y ||.

The Cayley-Hamilton Theorem

The Cayley-Hamilton Theorem guarantees that every (square) matrix satisfies its own characteristic equation.

For example, if A is the matrix:

        2 3
        1 4
      
then the characteristic equation is
lambda^2 - 6 * lambda + 5 = 0.
which is not true for all values lambda, but just a few special values known as eigenvalues. The Cayley-Hamilton theorem says that the matrix version of the characteristic equation, with A taking the place of lambda, is guaranteed to be true:
A^2 - 6 * A + 5 * I = 0.
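
Here is a one line numpy check of this relation for the example matrix:

        import numpy as np

        A = np.array([[2.0, 3.0],
                      [1.0, 4.0]])
        print(A @ A - 6.0 * A + 5.0 * np.eye(2))    # the zero matrix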

CentroSymmetric Matrix

A centrosymmetric matrix is one which is symmetric about its center; that is,

A(i,j) = A(m+1-i,n+1-j)

Example:

         1 10  8 11  5
        13  2  9  4 12
         6  7  3  7  6
        12  4  9  2 13
         5 11  8 10  1
      

A centrosymmetric matrix A satisfies the following equation involving the Exchange matrix J:

J * A * J = A.

Characteristic Equation

The characteristic equation of a (square) matrix A is the polynomial equation:

det ( A - lambda * I ) = 0
where lambda is an unknown scalar value.

The left hand side of the equation is known as the characteristic polynomial of the matrix. If A is of order N, then there are N roots of the characteristic equation, possibly repeated, and possibly complex.

For example, if A is the matrix:

        2 3
        1 4
      
then the characteristic equation is
            (    2 - lambda     3           )
        det (    1              4 - lambda  )  = 0
      
or
lambda^2 - 6 * lambda + 5 = 0.
This equation has roots lambda = 1 or 5.

Values of the scalar lambda which satisfy the characteristic equation are known as eigenvalues of the matrix.
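
For the example above, numpy can recover both the characteristic polynomial and its roots (np.poly returns the coefficients of det ( lambda * I - A ), which here agrees with the equation above):

        import numpy as np

        A = np.array([[2.0, 3.0],
                      [1.0, 4.0]])
        print(np.poly(A))              # [ 1. -6.  5.]  i.e. lambda^2 - 6*lambda + 5
        print(np.linalg.eigvals(A))    # the eigenvalues, 1 and 5 (in some order)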

Some facts about the characteristic equation of A:

The Cayley-Hamilton Theorem guarantees that the matrix itself also satisfies the matrix version of its characteristic equation.

Cholesky Factorization

The Cholesky factorization of a positive semidefinite symmetric matrix A has the form:

A = L * L'
where L is a lower triangular matrix, or equivalently:
A = R' * R
where R is an upper triangular matrix.
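
Here is a small numpy illustration, on an invented positive definite matrix:

        import numpy as np

        A = np.array([[4.0, 2.0],
                      [2.0, 3.0]])             # symmetric positive definite
        L = np.linalg.cholesky(A)              # lower triangular factor
        R = L.T                                # upper triangular factor
        print(np.allclose(A, L @ L.T))         # True:  A = L * L'
        print(np.allclose(A, R.T @ R))         # True:  A = R' * R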

If a matrix is symmetric, then it is possible to determine whether or not it is positive definite simply by trying to compute its Cholesky factorization. If the matrix has a zero eigenvalue, then it is only positive semidefinite, and the algorithm should theoretically spot this by computing a zero diagonal element; if the matrix actually has a negative eigenvalue, then at some point in the algorithm the square root of a negative number will be attempted.

Software to compute the Cholesky factorization often saves space by using symmetric matrix storage, and overwriting the original matrix A by its Cholesky factor L.

As long as the matrix A is positive definite, the Cholesky factorization can be computed from an LDL factorization, or, if no pivoting was done, from an LU factorization.

The Cholesky factorization can be used to compute the square root of the matrix.

The LINPACK routines SCHDC, SCHUD, SCHDD, SCHEX, SPOCO, SPOFA, SPODI, and SPOSL compute and use the Cholesky factorization.

Circulant Matrix

A circulant matrix is a matrix whose first row of values is repeated in each successive row, shifted one position to the right, with the end value "wrapping around".

Here is a square circulant matrix:

        1 2 3 4
        4 1 2 3
        3 4 1 2
        2 3 4 1
      
a "wide" rectangular circulant matrix:
        1 2 3 4 5
        5 1 2 3 4
        4 5 1 2 3
      
a "tall" rectangular circulant matrix:
        1 2 3
        5 1 2 
        4 5 1
        3 4 5
        2 3 4
      
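
Here is a small numpy sketch that builds a square circulant matrix from its first row by repeated shifts with wraparound:

        import numpy as np

        def circulant(first_row):
            first_row = np.asarray(first_row)
            return np.array([np.roll(first_row, k) for k in range(first_row.size)])

        print(circulant([1, 2, 3, 4]))
        # [[1 2 3 4]
        #  [4 1 2 3]
        #  [3 4 1 2]
        #  [2 3 4 1]]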

Simple facts about a (rectangular) circulant matrix A:

Simple facts about a square circulant matrix A:

Compare the concept of an anticirculant matrix.

Cofactor Matrix

The cofactor matrix of a square matrix A is generally used to define the adjoint matrix, or to represent the determinant.

For a given matrix A, the cofactor matrix is the transpose of the adjoint matrix:

cofactor ( A ) = ( adjoint ( A ) )'

The determinant det(A) can be represented as the product of each of the entries of any given row or column times their corresponding cofactor entries. In particular, consider the first row:

det(A) = A(1,1) * cofactor(A)(1,1) + A(1,2) * cofactor(A)(1,2) + ... + A(1,N) * cofactor(A)(1,N)

The formula for the (I,J) entry of the cofactor matrix of A is:

cofactor(A)(I,J) = (-1)^(I+J) * det ( M(A,I,J) )
where M(A,I,J) is the minor matrix of A, constructed by deleting row I and column J.

Column Echelon Form

Column echelon form is a special matrix structure which is usually arrived at by Gauss elimination.

Any matrix can be transformed into this form, using a series of elementary column operations. Once the form is computed, it is easy to compute the determinant, inverse, the solution of linear systems (even for underdetermined or overdetermined systems), the rank, and solutions to linear programming problems.

A matrix (whether square or rectangular) is in column echelon form if:

Column echelon form is primarily of use for teaching, and analysis of small problems, using exact arithmetic. It is of little interest numerically, because very slight errors in numeric representation or arithmetic can result in completely erroneous results.

Commuting Matrices

Two square matrices, A and B, are said to commute if

A * B = B * A

Facts about commuting matrices:

Companion Matrix

The companion matrix for a monic polynomial P(X) of degree N is a matrix of order N whose characteristic polynomial is P(X).

If the polynomial P(X) is represented as:

P(X) = X^N + C(1) * X^(N-1) + ... + C(N-1) * X + C(N).
then the companion matrix for this polynomial has the form:
        0 0 0 ... 0 0 -C(N)
        1 0 0 ... 0 0 -C(N-1)
        0 1 0 ... 0 0 -C(N-2)
        ...................
        0 0 0 ... 1 0 -C(2)
        0 0 0 ... 0 1 -C(1)
      

Note that the determinant of the companion matrix is either -C(N) or +C(N), so it is zero exactly when the constant term C(N) is zero, that is, exactly when 0 is a root of the polynomial.

The characteristic polynomial of a companion matrix is, in fact, the polynomial that was used to define the matrix originally. Thus it is always possible to construct a matrix with any desired set of eigenvalues, by constructing the corresponding characteristic polynomial, and then the companion matrix.
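
Here is a small numpy sketch that builds the companion matrix in the layout shown above, for an invented cubic, and confirms that its eigenvalues are the roots of the polynomial:

        import numpy as np

        def companion(c):
            # c = [ C(1), ..., C(N) ] for P(X) = X^N + C(1) X^(N-1) + ... + C(N)
            n = len(c)
            C = np.zeros((n, n))
            C[1:, :-1] = np.eye(n - 1)            # ones on the subdiagonal
            C[:, -1] = -np.array(c)[::-1]         # last column: -C(N), ..., -C(1)
            return C

        c = [-6.0, 11.0, -6.0]      # P(X) = X^3 - 6 X^2 + 11 X - 6 = (X-1)(X-2)(X-3)
        print(np.sort(np.linalg.eigvals(companion(c)).real))     # [1. 2. 3.]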

The companion matrix can also be used to perform a decomposition of a matrix A. If x is a vector, and K the Krylov matrix

K = Krylov ( A, x, n )
whose columns are the successive products x, A*x, A^2*x, and so on, and if K is nonsingular, then
A = K * C * K^(-1)
where the matrix C is the companion matrix of the characteristic polynomial of A.

(There are several equivalent forms of the companion matrix, with the coefficients running along the top, the bottom, or the first column of the matrix.)

Compatible Norms

A matrix norm and a vector norm are compatible if it is true, for all vectors x and matrices A that

||A*x|| <= ||A|| * ||x||
In some texts, the word consistent is used in this sense, instead of compatible.

In particular, if you have not verified that a pair of norms are compatible, then the above inequality is not guaranteed to hold. For any vector norm, it is possible to define at least one compatible matrix norm, namely, the matrix norm defined by:

||A|| = supremum ||A*x|| / ||x||
where the supremum (roughly, the "maximum") is taken over all nonzero vectors x. If a matrix norm can be derived from a vector norm in this way, it is termed a vector-bound matrix norm. Such a relationship is stronger than is required by compatibility.

If a matrix norm is compatible with some vector norm, then it is also true that

||A*B|| <= ||A|| * ||B||
where both A and B are matrices.

Complex Number Representation

Complex numbers have the form a+bi, where i is a special quantity with the property that i^2 = -1.

It is possible to devise real matrices that behave like complex numbers. Let the value "1" be represented by the identity matrix of order 2, and the value "i" be represented by

        0  1
       -1  0
      
Then it is easy to show that these matrices obey the rules of complex numbers. In particular, "i" * "i" = - "1". In general, the complex number a+bi is represented by
        a  b
       -b  a
      
and multiplication and inversion have the correct properties.
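
Here is a small numpy check of this representation (the values a and b are invented):

        import numpy as np

        def as_matrix(a, b):        # represents a + b*i
            return np.array([[ a, b],
                             [-b, a]])

        i = as_matrix(0.0, 1.0)
        print(i @ i)                                       # - identity, i.e. "i" * "i" = -"1"
        print(as_matrix(1.0, 2.0) @ as_matrix(3.0, 4.0))   # represents (1+2i)*(3+4i) = -5+10i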

Condition Number

The condition number of the (square) coefficient matrix A of a linear system is a (nonnegative) number used to estimate the amount by which small errors in the right hand side b, or in A itself, can change the solution x.

This analysis ignores arithmetic roundoff, which is hard to analyze, and focusses on easily measurable quantities known beforehand, and how they will amplify or diminish the roundoff errors.

Small values of the condition number suggest that the algorithm will not be sensitive to errors, but large values indicate that small data or arithmetic errors may explode into enormous errors in the answer.

The condition number is defined in terms of a particular matrix norm. Many different matrix norms may be chosen, and the actual value of the condition number will vary depending on the norm chosen. However, the general rule that large condition numbers indicate sensitivity will hold true no matter what norm is chosen.

The condition number for a matrix A is usually defined as

condition ( A ) = || A || * || A^(-1) ||.
If A is not invertible, the condition number is infinite.
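
As a small illustration, numpy's cond function computes this quantity (in the 2-norm by default); the invented nearly singular matrix below shows how the value grows:

        import numpy as np

        A = np.array([[1.0, 1.0],
                      [1.0, 1.0001]])      # nearly singular, hence badly conditioned
        print(np.linalg.cond(A))           # roughly 4e4
        print(np.linalg.cond(np.eye(2)))   # 1.0, the smallest possible value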

Simple facts about the condition number:

LINPACK routines such as SGECO return RCOND, an estimate of the reciprocal of the condition number in the L1 matrix norm.

Turing's M condition number, M(A), for a matrix of order N, is defined as

M(A) = N * max | A(i,j) | * max | A^(-1)(i,j) |.

Turing's N condition number, N(A) is

N(A) = Frob ( A ) * Frob ( A^(-1) ) / N
where Frob(A) is the Frobenius matrix norm.

The Von Neumann and Goldstine P condition number is

P(A) = | lambda_Max / lambda_Min |
where lambda_Max and lambda_Min are the eigenvalues of largest and smallest magnitude, which is equivalent to using the spectral radii of A and A^(-1).

There is also a condition number defined for the eigenvalue problem, which attempts to estimate the amount of error to be expected when finding the eigenvalues of a matrix A.

Congruent Matrix

Congruent matrices A and B are related by a nonsingular matrix P such that

A = P' * B * P.

Congruent matrices have the same inertia.

Congruence is of little interest by itself, but the case where P is also an orthogonal matrix is much more important.

Conjugate Gradient Method

The conjugate gradient method is designed to solve linear systems

A * x = b
when the matrix A is symmetric, and positive definite.

The classical method is not an iterative method, but rather a direct method which, in exact arithmetic, produces the exact solution after at most N steps.

Because of numerical inaccuracies and instabilities, many implementations of the method repeat the computation several times, until the residual error is deemed small enough.

On the other hand, if N is very large, and a suitable preconditioner is used, it is possible to treat the conjugate gradient method as "essentially" an iterative method, in the sense that it may be the case that a good approximate solution can be found by stopping the algorithm early. The matrix is assumed to be symmetric positive definite; if it is also large and sparse, a suitable preconditioner may be the Incomplete Cholesky Factorization. In this case, the algorithm is sometimes referred to as ICCG(0): that is, the Incomplete Cholesky - Conjugate Gradient algorithm.

The method is ideally suited for use with large sparse systems, because the matrix A is only accessed to compute a single matrix-vector product on each step. This involves no fill in or overwriting of the data structure that describes A. On the other hand, if the matrix A is dense, the conjugate gradient method costs roughly 3 times the number of operations for direct Gauss elimination.

The conjugate gradient method can be considered as a minimization of the functional f(x), defined by

f(x) = x' * ( 0.5 * A * x - b )
which achieves its minimum value when x solves the linear system.

Here are the formulas for the basic conjugate gradient method. Brackets indicate the value of an iterative quantity. X[0] is the initial value of the vector X, X[1] the value after one iteration, and so on.

        X[0] = 0
        For K = 1 to N
      
Compute the residual error:
        R[K-1] = B - A * X[K-1]
      
Compute the direction vector:
        If K = 1 then
          P[K] = R[0]
        else
          BETA = - P'[K-1] * A * R[K-1] 
               / ( P'[K-1] * A * P[K-1] )
          P[K] = R[K-1] + BETA * P[K-1]
        end if
      
Compute the location of the next iterate:
        ALPHA = ( R'[K-1] ) * R[K-1] / ( P'[K] * A * P[K] )
        X[K] = X[K-1] + ALPHA * P[K]
      end for
      
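
Here is a direct Python (numpy) transcription of the iteration above, run on an invented symmetric positive definite test matrix; a production code would add a convergence test and, usually, a preconditioner:

        import numpy as np

        def conjugate_gradient(A, b):
            n = len(b)
            x = np.zeros(n)
            p = np.zeros(n)
            for k in range(1, n + 1):
                r = b - A @ x                                # residual R[K-1]
                if k == 1:
                    p = r
                else:
                    beta = -(p @ (A @ r)) / (p @ (A @ p))    # make new p A-conjugate to old p
                    p = r + beta * p
                alpha = (r @ r) / (p @ (A @ p))
                x = x + alpha * p
            return x

        A = 2.0 * np.eye(5) - np.eye(5, k=1) - np.eye(5, k=-1)   # symmetric positive definite
        b = np.arange(1.0, 6.0)
        print(np.allclose(A @ conjugate_gradient(A, b), b))      # True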

Conjugate gradient algorithms are available in IMSL, ITPACK and NSPCG.

Conjugate Matrix

The conjugate matrix of a complex matrix A, denoted by A* or conjugate ( A ), is the matrix obtained by replacing each entry of A by its complex conjugate.

(In this document, the form conjugate ( A ) is preferred, because A* is easily confused with multiplication.)

The complex conjugate transpose, sometimes called the Hermitian or tranjugate of A, is derived from A by complex conjugation followed by transposition, and is denoted by A^H.

Conjugate of a Complex Number

The conjugate of a complex number z = a + b * i is the complex number

conjugate ( z ) = a - b * i.

The conjugate is frequently represented by placing a bar over the quantity, or occasionally a star after it, as in z*.

The complex conjugate can be used in a formula for the norm or magnitude of a complex number, which must always be a real nonnegative value:

norm ( z ) = sqrt ( z * conjugate ( z ) ) = sqrt ( a^2 + b^2 ).

For complex vectors, an inner product with the correct properties may be defined as:

V dot W = ( V, W ) = sum ( 1 <= I <= N ) conjugate ( V(I) ) * W(I).
This inner product is computed in the BLAS function CDOTC, for example, and yields another relationship with the Euclidean vector norm:
|| V || = sqrt ( V dot V )
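
As a small illustration, numpy's vdot conjugates its first argument, matching this inner product (the vectors are invented):

        import numpy as np

        v = np.array([1.0 + 2.0j, 3.0 - 1.0j])
        w = np.array([2.0 + 0.0j, 1.0 + 1.0j])
        print(np.vdot(v, w))                   # sum of conjugate ( v(i) ) * w(i)
        print(np.sqrt(np.vdot(v, v).real))     # the Euclidean norm || v ||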

Conjunctive Matrix

Two (complex) matrices A and B are said to be conjunctive if there is some nonsingular matrix P so that

A = ( conjugate ( P ) )' * B * P.

This is the extension to complex matrices of the concept of a congruent real matrix.

Consistent System

A consistent linear system is an M by N linear system A * x = b, with A and b given, for which there is at least one solution vector x.

A simple example of an inconsistent system is:

        1 0 1        1
        0 2 3 * x =  2
        1 2 4        4
      
The last row of A is the sum of the previous two rows, but the last entry of b is not the sum of the previous two entries, so there can be no solution.

By contrast, the system

        1  2  3         14
        4  5  6  * x  = 32
      
is consistent, having as one solution (of many) the vector (1,2,3), and the system
        1  0          5
        1  2          5
        2  4  * x =  10
        3  6         15
        4  8         20
      
is consistent, having the unique solution ( 5, 0 )

Facts about a consistent linear system:

If a linear system is not consistent, then there is no solution x. But in such a case, it may be of interest to determine an approximate solution which satisfies some condition, such as minimizing some norm of the residual. One technique for doing this involves the linear least squares method.
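
For the inconsistent 3 by 3 example above, a least squares "solution" can still be computed; here is a small numpy illustration:

        import numpy as np

        A = np.array([[1.0, 0.0, 1.0],
                      [0.0, 2.0, 3.0],
                      [1.0, 2.0, 4.0]])
        b = np.array([1.0, 2.0, 4.0])

        x, residual, rank, sv = np.linalg.lstsq(A, b, rcond=None)
        print(rank)          # 2: the matrix is singular and the system is inconsistent
        print(A @ x - b)     # a nonzero residual of minimum Euclidean norm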

Convergent Matrix

A convergent matrix A is a square matrix for which the limit, as n goes to infinity, of A^n is zero.

A matrix is convergent if and only if the spectral radius rho(A) satisfies

rho(A) < 1

A semiconvergent matrix A is a square matrix for which the limit as n goes to infinity of A^n exists. If a matrix is semiconvergent, then either it is convergent (rho(A) < 1), or else rho(A) = 1. In the second case, it must be further true that the only eigenvalue of norm 1 is 1, and that this eigenvalue is semisimple.

The simplest example of a semiconvergent matrix is the identity matrix.

Correlation Matrix

A correlation matrix is a square positive semidefinite symmetric matrix with unit diagonal entries, and off-diagonal entries whose magnitude is no greater than 1.

In statistical applications, entry A(i,j) represents the strength of the correlation between variables I and J. The strongest positive correlation is +1; a correlation of 0 means the variables are apparently completely independent; a negative correlation means that variable I tends to increase when variable J decreases.

The entries of the correlation matrix can be represented as

A(i,j) = X(I) dot X(J) / ( ||X(I)|| * ||X(J)|| )
where the norm used is derived from the dot product, and the dot product takes the form appropriate for the kind of data being handled, such as finite vectors, or continuous functions.
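
As an illustration, numpy's corrcoef builds the standard statistical correlation matrix from invented data (it centers the data first, a slight variation on the inner product formula above):

        import numpy as np

        rng = np.random.default_rng(2)
        x = rng.standard_normal(100)
        data = np.vstack([x,
                          2.0 * x + 0.1 * rng.standard_normal(100),
                          rng.standard_normal(100)])

        C = np.corrcoef(data)       # 3 by 3 correlation matrix with unit diagonal
        print(np.round(C, 2))       # variables 1 and 2 strongly correlated; 3 is not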

Cross Product

The cross product of two vectors u and v, denoted u x v, is a vector w which is perpendicular to u and v, pointing in the direction so that (u,v,w) forms a right handed coordinate system, and whose length is equal to the area of the parallelogram two of whose sides are u and v.

Algebraically,

w(1) = u(2) * v(3) - u(3) * v(2)
w(2) = u(3) * v(1) - u(1) * v(3)
w(3) = u(1) * v(2) - u(2) * v(1)

If the unit vectors in the coordinate directions are denoted by i, j and k, then the cross product vector can also be regarded as the (vector) value of the following "determinant":

                        |   i    j    k  |
        w = u x v = det | u(1) u(2) u(3) |
                        | v(1) v(2) v(3) |
      
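
numpy implements these component formulas directly; a quick check on invented vectors:

        import numpy as np

        u = np.array([1.0, 2.0, 3.0])
        v = np.array([4.0, 5.0, 6.0])
        w = np.cross(u, v)
        print(w)                    # [-3.  6. -3.]
        print(u @ w, v @ w)         # 0.0 0.0 : w is perpendicular to both u and v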

Cyclic Reduction

Cyclic reduction is a method for solving a linear system A*x=b in the special case where A is a tridiagonal matrix.

On a parallel computer, this method solves the system in LOG(N) "steps" where N is the order of A. A standard Gauss elimination method for a tridiagonal system would require roughly N "steps" instead.

A tridiagonal system has some very special properties that will allow us to carry this operation out. Consider this system of 7 equations:

        A11 x1 + A12 x2                                              = y1
        A21 x1 + A22 x2 + A23 x3                                     = y2
                 A32 x2 + A33 x3 + A34 x4                            = y3
                          A43 x3 + A44 x4 + A45 x5                   = y4
                                   A54 x4 + A55 x5 + A56 x6          = y5
                                            A65 x5 + A66 x6 + A67 x7 = y6
                                                     A76 x6 + A77 x7 = y7
      

The first equation can be used to eliminate the coefficient A21 in the second equation, and the third equation to eliminate the coefficient A23 in the second equation. This knocks out variables x1 and x3 in the second equation, but adds x4 into that equation.

By the same method, x3 and x5 can be eliminated from the equation for x4, and so on. By eliminating the odd variables from the even equations, a smaller tridiagonal system is derived, with half the equations and variables.

If elimination is applied to this set, the number of equations is again reduced by half; this reduction may be repeated until a single equation in one variable is reached. Backsubstitution then produces the values of all the variables.

The reason this method might have an advantage over Gauss elimination is that, at each step of the elimination phase, the parts of the step are independent. If many computer processors are available, then each can be working on a separate portion of the elimination. If the number of processors is large enough, the system can really be solved in LOG(N) time.

Cyclic reduction routines are available in the NCAR software library, the SLATEC library, and the Cray SCILIB library.

Cyclic Tridiagonal Matrix

A cyclic tridiagonal matrix is a generalization of a tridiagonal matrix which includes an extra last entry in the first row, and an extra first entry in the last row.

An example of a cyclic tridiagonal matrix:

        -2  1  0  0  1
         1 -2  1  0  0
         0  1 -2  1  0
         0  0  1 -2  1
         1  0  0  1 -2
      

A cyclic tridiagonal matrix is not a tridiagonal matrix. If the matrix is constant along the three generalized diagonals, a cyclic tridiagonal matrix is a circulant matrix. A cyclic tridiagonal matrix can arise in situations where a periodic boundary condition is applied.

It is very disappointing that a cyclic tridiagonal matrix is not a tridiagonal matrix, since there are so many good methods for solving tridiagonal linear systems. One way to solve a cyclic tridiagonal system is to use the Sherman Morrison Formula and view the matrix as a rank one perturbation of a tridiagonal matrix. Another approach is to view it as a border banded matrix.
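
Here is a numpy sketch of the Sherman Morrison approach just mentioned, using an invented nonsingular cyclic tridiagonal matrix; for simplicity the two tridiagonal solves are done with a dense solver, where a real code would use a tridiagonal routine:

        import numpy as np

        n = 5
        alpha, beta = 1.0, 1.0                  # the two corner entries
        A = 4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
        A[0, -1], A[-1, 0] = alpha, beta
        b = np.ones(n)

        # Rank one splitting A = T + u * v' with T tridiagonal (the corners cancel).
        gamma = 1.0
        u = np.zeros(n)
        v = np.zeros(n)
        u[0], u[-1] = gamma, beta
        v[0], v[-1] = 1.0, alpha / gamma
        T = A - np.outer(u, v)

        y = np.linalg.solve(T, b)               # in practice: tridiagonal solves
        z = np.linalg.solve(T, u)
        x = y - (v @ y) / (1.0 + v @ z) * z     # Sherman Morrison correction
        print(np.allclose(A @ x, b))            # True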

A cyclic tridiagonal matrix may also be called a periodic tridiagonal matrix.

Defective Matrix

A defective matrix is a (square) matrix that does not have a full set of N linearly independent eigenvectors.

For every eigenvalue, there is always at least one eigenvector, and eigenvectors corresponding to distinct eigenvalues are linearly independent. If the N eigenvalues of a matrix are distinct, then it surely has N linearly independent eigenvectors, and so cannot be defective.

Conversely, if a matrix is defective, then it must have at least one repeated eigenvalue, that is, an eigenvalue of algebraic multiplicity greater than 1. A matrix is defective if and only if its Jordan Canonical Form has at least one nonzero entry on the superdiagonal.

Thus, a simple example of a defective matrix is:

        1 1
        0 1
      
which has the single eigenvalue 1, with algebraic multiplicity 2, but geometric multiplicity 1. The only eigenvector (up to scalar multiples) is ( 1, 0 ).

If a matrix is not defective, then its eigenvectors form a basis for the entire linear space. In other words, any vector y can be written as

y = X * c
where X is the array of eigenvectors of A.

If a matrix A is not defective, then it is similar to its diagonal eigenvalue matrix:

A = X * LAMBDA * X^(-1)
and the similarity transformation matrix X is actually the eigenvector matrix. Thus a nondefective matrix is diagonalizable.

This in turn allows us to make interesting statements about the inverse, transpose, and powers of A. For instance, we see that

A^2 = ( X * LAMBDA * X^(-1) ) * ( X * LAMBDA * X^(-1) )
= X * LAMBDA^2 * X^(-1)
leading us to the statement that for a nondefective matrix, the square has the same eigenvectors, and the square of the eigenvalues.

Deflation

Deflation is a technique for "removing" a known eigenvalue from a matrix, in order to facilitate the determination of other eigenvalues.

For example, the power method is able to estimate the eigenvalue of largest modulus of a matrix A. Once this is computed, it might be desired to find the next largest eigenvalue. Deflation can be used to essentially create a new matrix A^, which has the same eigenvalues as A, except that the largest eigenvalue has been dropped (and the order of A^ reduced by 1) or the largest eigenvalue is replaced by 0. In either case, the power method applied to A^ will produce the next largest eigenvalue of A.

To eliminate a known eigenvalue, lambda, it is necessary to know its eigenvector x, which we will assume has been scaled to have unit Euclidean norm. By the properties of eigenvectors, we know that

A * x = lambda * x.
Now define the matrix A^ so that:
A^ = A - lambda * x * x'.
Now x is an eigenvector of A^ with eigenvalue 0, because:
A^ * x = ( A - lambda * x * x' ) * x
= A * x - lambda * x * x' * x
= lambda * x - lambda * x
= 0

If the power method is being employed, then the new iteration should try to "factor out" any component of the eigenvector x; otherwise, small errors in the computation of the first eigenvalue and eigenvector will interfere with the next results.

Theoretically, this process may be repeated as often as desired, eliminating each eigenvalue as it is discovered. Practically, however, accumulated errors in the eigenvalues and eigenvectors make the computation more and more unreliable with each step of deflation. Thus, if more than a few eigenvalues are desired, it is more appropriate to use a standard technique.
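
Here is a small numpy sketch of deflation combined with the power method, on an invented symmetric test matrix (the power method loop is a bare bones version with a fixed iteration count rather than a convergence test):

        import numpy as np

        def power_method(A, iters=200):
            x = np.ones(A.shape[0])
            for _ in range(iters):
                x = A @ x
                x = x / np.linalg.norm(x)
            return x @ A @ x, x        # Rayleigh quotient and unit eigenvector

        A = np.array([[4.0, 1.0, 0.0],
                      [1.0, 3.0, 1.0],
                      [0.0, 1.0, 2.0]])

        lam1, x1 = power_method(A)
        Ahat = A - lam1 * np.outer(x1, x1)      # deflate the dominant eigenvalue
        lam2, _ = power_method(Ahat)
        print(lam1, lam2)                       # the two largest eigenvalues of A
        print(np.sort(np.linalg.eigvals(A)))    # for comparison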

Derogatory Matrix

A derogatory matrix is a matrix whose minimal polynomial is of lower degree than its characteristic polynomial.

For this to happen, of course, the matrix must have at least one eigenvalue with algebraic multiplicity greater than one.

Surprisingly, the identity matrix is derogatory (ignoring the case N=1). A matrix can be guaranteed to be nonderogatory if the geometric multiplicity of every eigenvalue is 1.

Perhaps the only reason that the term is worth knowing is this fact: every nonderogatory matrix is similar to the companion matrix of its characteristic polynomial.

Determinant of a Matrix

The determinant of a square matrix is a scalar value whose most common usage is to indicate whether a matrix is singular or not.

If a matrix is singular, then it doesn't have an inverse and linear systems involving this matrix cannot be reliably solved.

In numerical work, the determinant is not really a reliable indicator of singularity, and other data, such as the relative size of the matrix elements encountered during pivoting, are preferred when checking for singularity.

The determinant also occurs in the definition of the eigenvalue problem.

An explicit formula for the determinant of a matrix A is:

det ( A ) = sum [ over all P ] sign(P) * A(1,P(1)) * A(2,P(2)) * ... * A(N,P(N)).
where the sum ranges over all possible permutations P of the numbers 1 through N, and sign(P) is +1 for an even permutation, and -1 for an odd permutation. (Any permutation may be accomplished by a sequence of switching pairs of objects. The permutation is called even or odd, depending on whether the number of switches is even or odd).

A numerical method for finding the determinant comes as a byproduct of the LU factorization used in Gaussian elimination. Typically, this factorization has the form

A = P * L * U,
and the value of the determinant is simply
det ( A ) = det ( P ) * product ( 1 <= I <= N ) U(I,I).
where det ( P ) is +1 or -1, again determined by the sign of the permutation.

Simple facts about the determinant:

A single elementary row operation has the following effect on the determinant:

For small matrices, the exact determinant is simple to compute by hand. The determinant of a 2 by 2 matrix

        a  b
        c  d
      
is a*d-b*c, while the determinant of a 3 by 3 matrix:
        a  b  c
        e  f  g
        h  i  j
      
is a * (f*j-g*i) - b * (e*j-g*h) + c * (e*i-f*h).

If absolutely necessary, the determinant of a matrix of order N can be computed recursively in terms of determinants of minor matrices. Let M(A,I,J) stand for the (I,J) minor matrix of A. Then the determinant of A is

det ( A ) = sum ( 1 <= J <= N ) (-1)^(I+J) * A(I,J) * det ( M(A,I,J) ), for any fixed row I.
Of course, now we need to compute the determinants of the N minor matrices, but the order of these matrices has been reduced by 1. Theoretically, we can represent the determinant of any of these matrices of order N-1 by a similar sum involving minor matrices of order N-2, and this process can be repeated until we reach matrices of order 1 or 2, whose determinants are easy to compute. In practice, this method is never used except in simple classroom exercises.
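
Here is a deliberately naive Python transcription of that cofactor expansion (expanding along the first row), suitable only for classroom sized matrices; the test matrix is invented:

        import numpy as np

        def det_by_minors(A):
            n = A.shape[0]
            if n == 1:
                return A[0, 0]
            total = 0.0
            for j in range(n):
                minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
                total += (-1) ** j * A[0, j] * det_by_minors(minor)
            return total

        A = np.array([[2.0, 3.0, 1.0],
                      [1.0, 4.0, 5.0],
                      [6.0, 0.0, 2.0]])
        print(det_by_minors(A), np.linalg.det(A))    # both give 76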

There is a geometric interpretation of the determinant. If the rows or columns of A are regarded as vectors in N dimensional space, then the absolute value of the determinant is the volume of the parallelepiped, or "slanted cube", whose edges at one corner are defined by these vectors.

LAPACK and LINPACK provide routines for computing the determinant of a matrix, after the matrix has been decomposed into LU factors.

Diagonal Dominance

A matrix is (column) diagonally dominant if, for every column, the sum of the absolute values of the offdiagonal elements is never greater than the absolute value of the diagonal element.

The matrix is strictly diagonally dominant if the offdiagonal sum is always strictly less than the absolute value of the diagonal element.

The same definitions can be used to consider rows instead of columns. The terms column diagonally dominant and row diagonally dominant may be used, if necessary, to specify which case is being considered.

A strictly diagonally dominant matrix cannot be singular, by Gershgorin's Theorem.

A diagonally dominant matrix which is also irreducible, and which is strictly diagonally dominant in at least one row, is referred to as an irreducibly diagonally dominant matrix; such a matrix cannot be singular.

Here is a diagonally dominant matrix which is not strictly diagonally dominant:

         2 -1  0  0  0
        -1  2 -1  0  0
         0 -1  2 -1  0
         0  0 -1  2 -1
         0  0  0 -1  2
      
For most of the rows, the sum of the absolute values of the off-diagonal entries is equal to the absolute value of the diagonal entry; in the first and last rows it is strictly smaller. The matrix is also irreducible, so it is irreducibly diagonally dominant, and hence cannot be singular.

For a linear system A * x = b, if the matrix A is strictly diagonally dominant, then both Jacobi iteration and Gauss Seidel iteration are guaranteed to converge to the solution.
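
Here is a small numpy sketch, on an invented strictly diagonally dominant matrix, that checks the dominance condition and runs a few Jacobi sweeps:

        import numpy as np

        A = np.array([[ 4.0, -1.0,  0.0],
                      [-1.0,  4.0, -1.0],
                      [ 0.0, -1.0,  4.0]])
        b = np.array([1.0, 2.0, 3.0])

        off = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
        print(np.all(off < np.abs(np.diag(A))))    # True: strictly row diagonally dominant

        x = np.zeros(3)
        D = np.diag(A)
        for _ in range(50):
            x = (b - (A @ x - D * x)) / D          # one Jacobi sweep
        print(np.allclose(A @ x, b))               # True: the iteration converged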

Diagonal Matrix

A diagonal matrix is one whose only nonzero entries are along the main diagonal. For example:

        3 0 0
        0 4 0
        0 0 7
      

Simple facts about a diagonal matrix A:

Diagonalizable Matrix

A diagonalizable matrix is any (square) matrix A which is similar to a diagonal matrix:

A = P * D * P^(-1).

This concept is important in the study of eigenvectors. To see the relationship, post-multiply the equation by P:

A * P = P * D.
Looking at the columns of P as eigenvectors, and the diagonal entries of D as eigenvalues, this shows that a matrix is diagonalizable exactly when it has N linearly independent eigenvectors.

In certain cases, not only is a matrix diagonalizable, but the matrix P has a special form. The most interesting case is that of any (real) symmetric matrix A; not only can such a matrix be diagonalized, but the similarity matrix is orthogonal:

A = Q * D * Q^(-1) = Q * D * Q',
This fact can be interpreted to show that not only does every symmetric matrix have a complete set of eigenvectors, but the eigenvectors and eigenvalues are real, and the eigenvectors are pairwise orthogonal.
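
Here is a small numpy check of this statement, on an invented symmetric matrix:

        import numpy as np

        A = np.array([[2.0, 1.0, 0.0],
                      [1.0, 3.0, 1.0],
                      [0.0, 1.0, 2.0]])
        lam, Q = np.linalg.eigh(A)
        print(np.allclose(Q.T @ Q, np.eye(3)))             # True: Q is orthogonal
        print(np.allclose(A, Q @ np.diag(lam) @ Q.T))      # True: A = Q * D * Q'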

Similarly, a complex matrix that is Hermitian has a complete set of eigenvectors, and its eigenvalues are real. (The eigenvectors in this case will not generally be real.)

Simple facts:

Downshift Matrix

The downshift matrix A circularly shifts all vector entries or matrix rows down 1 position.

Example:

        0 0 0 1
        1 0 0 0
        0 1 0 0
        0 0 1 0
      
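
Here is a small numpy construction of the example above, together with its action on a vector:

        import numpy as np

        D = np.roll(np.eye(4, dtype=int), 1, axis=0)   # identity with rows rotated down
        print(D)                                       # matches the example above

        x = np.array([10, 20, 30, 40])
        print(D @ x)                                   # [40 10 20 30]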

Facts about the downshift matrix A:

Eigenvalues

Eigenvalues are special values associated with a (square) matrix, which can be used to analyze its behavior in multiplying any vector.

The formal definition of an eigenvalue of a matrix A is that it is any value lambda which is a root of the characteristic equation of the matrix,

det ( A - lambda * I ) = 0.

lambda is an eigenvalue of A if and only if there is a nonzero vector x, known as an eigenvector (or sometimes a "right" eigenvector), with the property that

A * x = lambda * x.
Note that there must also be a "left" eigenvector y, with the property
y' * A = lambda * y', or equivalently A' * y = lambda * y.

The characteristic equation has exactly N roots, counting multiplicity, so a matrix has N eigenvalues. An important consideration is whether any eigenvalue is a repeated root, which determines how hard the eigenvector computation will be.

If a matrix has the maximum possible number of linearly independent eigenvectors (namely N, the order of the matrix), then the eigenvalues and eigenvectors can be used to diagonalize the matrix. This happens exactly when the matrix is not defective; if the matrix is normal, the diagonalization can moreover be carried out with a unitary matrix.

If the NxN matrix A has N linearly independent eigenvectors, we may collect the eigenvectors as columns of a matrix X and the eigenvalues as entries of the diagonal matrix Lambda. We may first write

A * X = X * Lambda
and then, because the eigenvectors are presumed to be linearly independent, we have the eigen decomposition or the spectral decomposition of A:
A = X * Lambda * X^(-1)
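
Here is a small numpy illustration of this decomposition, on an invented nondefective matrix:

        import numpy as np

        A = np.array([[2.0, 1.0],
                      [1.0, 3.0]])
        lam, X = np.linalg.eig(A)              # eigenvalues and eigenvector columns
        Lambda = np.diag(lam)
        print(np.allclose(A @ X, X @ Lambda))                   # True: A * X = X * Lambda
        print(np.allclose(A, X @ Lambda @ np.linalg.inv(X)))    # True: A = X * Lambda * X^(-1)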

Simple facts about eigenvalues of A:

Simple algorithms for computing eigenvalues include the power method and the inverse power method. The QR method is a more powerful method that can handle complex and multiple eigenvalues.

LAPACK and EISPACK include algorithms for computing the eigenvalues and eigenvectors of a variety of types of matrix, as well as methods that can be applied to more general eigenvalue problems.

Eigenvectors

A nonzero vector x is an eigenvector of the square matrix A if

A * x = lambda * x
for some scalar value lambda, called the associated eigenvalue.

Sometimes this eigenvector is more particularly described as a right eigenvector, so that we may also consider left eigenvectors, that is, vectors y for which it is true that

y' * A = mu * y', or equivalently A' * y = mu * y
for some scalar mu.

For every eigenvalue of a matrix, there is at least one eigenvector. Every nonzero multiple of this eigenvector is also an eigenvector, but in an uninteresting way. If, and only if, an eigenvalue is a repeated root, then there may be more than one linearly independent eigenvector associated with that eigenvalue. In particular, if an eigenvalue is repeated 3 times, then there will be 1, 2 or 3 linearly independent eigenvectors corresponding to that eigenvalue.

Facts about eigenvectors:

EISPACK

EISPACK is a package of routines for handling the standard and generalized eigenvalue problems.

The beginning user who is not interested in trying to learn the details of EISPACK and simply wants the answer to an eigenvalue problem quickly should call one of the main driver routines. Each of these is tailored to handle a given problem completely with a single subroutine call. For more advanced work, it may be worth investigating some of the underlying routines.

Driver routines to solve A*x=lambda*x include:

For the generalized eigenvalue problem:

EISPACK Matrix Norm

The EISPACK matrix norm is used in the EISPACK eigenvalue package.

The definition of the norm for an M by N matrix is:

||A|| = sum ( I = 1 to M, J = 1 to N ) | A(I,J) |

It's a simple exercise to verify that this quantity satisfies the requirements for a matrix norm.

This norm is easy to calculate, and was used in EISPACK in order to have a standard against which to compare the size of matrix elements that were being driven to zero. I haven't seen it used anywhere else in practice.

Elementary Column Operations

Elementary column operations are a simple set of matrix operations that can be used to carry out Gauss elimination, Gauss Jordan elimination, or the reduction of a matrix to column echelon form.

Restricting the operations to a simple set makes it easy to:

The three elementary column operations include:

Each of these operations may be represented by an elementary matrix, and the transformation of the original matrix A to the reduced matrix B can be expressed as postmultiplication by a concatenation of elementary matrices:

B = A * E(1) * E(2) * ... * E(k)
which may be abbreviated as:
B = A * C
Since C will be guaranteed to be invertible, we also know that,
B * C^(-1) = A
which yields a factorization of A.

Elementary Matrix

An elementary matrix E is one which, when pre-multiplying another matrix A, produces a product matrix E * A which has exactly one of the following properties:

The matrix E which interchanges rows R1 and R2 of matrix A has the form E(I,J)=:

The inverse of this matrix is simply its transpose

The matrix E which multiplies row R1 of A by the constant s has the form E(I,J)=:

The inverse of this matrix is constructed by replacing s by 1/s.

The matrix E which adds s * row R2 to row R1 of A has the form E(I,J) =:

The inverse of this matrix is constructed in the same way, using -s.

If a matrix F can be represented as the product of elementary matrices,

F = E(1) * E(2) * ... * E(M),
then its inverse is:
F^(-1) = E(M)^(-1) * E(M-1)^(-1) * ... * E(1)^(-1).

An elementary similarity transformation uses a matrix F which is the product of elementary matrices, and transforms the matrix A into the similar matrix B by the formula

B = F^(-1) * A * F.
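
Here is a small numpy sketch that builds one elementary matrix of each kind (order 3, with rows numbered from zero) and checks the inverses described above; the particular rows and values are invented for illustration:

        import numpy as np

        I = np.eye(3)

        E_swap = I[[1, 0, 2]]                   # interchange rows 0 and 1
        E_scale = np.diag([5.0, 1.0, 1.0])      # multiply row 0 by s = 5
        E_add = I.copy()
        E_add[0, 1] = 2.0                       # add 2 * row 1 to row 0

        A = np.arange(9.0).reshape(3, 3)
        print(E_swap @ A)                       # A with rows 0 and 1 interchanged

        print(np.allclose(np.linalg.inv(E_scale), np.diag([0.2, 1.0, 1.0])))   # inverse uses 1/s
        E_add_inv = I.copy()
        E_add_inv[0, 1] = -2.0
        print(np.allclose(np.linalg.inv(E_add), E_add_inv))                    # inverse uses -s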

Elementary Row Operations

Elementary row operations are a simple set of matrix operations that can be used to carry out Gauss elimination, Gauss Jordan elimination, or the reduction of a matrix to row echelon form.

Restricting the operations to a simple set makes it easy to:

The three elementary row operations include:

Each of these operations may be represented by an elementary matrix, and the transformation of the original matrix A to the reduced matrix B can be expressed as premultiplication by a concatenation of elementary matrices:

B = E(k) * E(k-1) * ... * E(2) * E(1) * A
which may be abbreviated as:
B = C * A.
Since C will be guaranteed to be invertible, we also know that,
C^(-1) * B = A
which yields a factorization of A.

Ellipsoids

An ellipsoid is an N dimensional generalization of an ellipse. The formula for an ellipsoid may be written as:

sum ( I = 1 to N, J = 1 to N ) A(I,J) * X(I) * X(J) = 1.
where A is a positive definite symmetric matrix.

A principal axis of an ellipsoid is any N dimensional point X on the ellipsoid such that the vector from the origin to X is normal to the ellipsoid.

In the general case, there are exactly N principal axes (plus their negatives). In degenerate cases, there may be an entire plane of vectors that satisfy the requirement, but it is always possible to choose a set of N principal axes which are linearly independent.

Moreover, in the general case, the principal axes are pairwise orthogonal vectors, and in the degenerate case, may be chosen pairwise orthogonal.

Moreover, it is always true that the principal axes are eigenvectors of the matrix A of ellipsoid coefficients. The length of the principal axis vector associated with an eigenvalue lambda(I) is 1 / Sqrt ( lambda(I) ).
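
Here is a small numpy sketch of these facts, for an invented 2 by 2 positive definite matrix:

        import numpy as np

        A = np.array([[3.0, 1.0],
                      [1.0, 2.0]])
        lam, Q = np.linalg.eigh(A)
        axes = Q / np.sqrt(lam)          # column i has length 1 / sqrt ( lambda(i) )
        for v in axes.T:
            print(v, v @ A @ v)          # each axis endpoint satisfies x' A x = 1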

These facts have a strong relationship to the formulation of the conjugate gradient method.

Equilibration

Equilibration is the technique of balancing the rows or columns of a matrix by rescaling them.

Consider, for instance, the fact that the following two equations are equivalent:

0.0001 * x + 0.0001 * y = 0.0001
and
1000 * x + 1000 * y = 1000
However, the large coefficients in the second equation will bias a Gauss elimination routine to choose that equation as its pivot. Actually, it's more important in this case that the chosen row be as "linearly independent as possible" from the other rows, and this is more likely to occur if we ensure that all the rows start out with an equal norm. This can be done very simply, by finding the element of maximum absolute value in each row and dividing that row (and its right hand side) by that value. Such a technique is called row equilibration. It is not necessary that the rows have precisely the same norm; it is desirable that the norms of the rows be maintained within some controlled range.
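Here is a minimal sketch of row equilibration in Python with NumPy (our own illustration, not a library routine), applied to the two-equation example above:

        import numpy as np

        def row_equilibrate(A, b):
            # Divide each row of A, and the matching entry of b, by the
            # largest absolute value in that row, so every row has
            # infinity norm 1.
            A = A.astype(float).copy()
            b = b.astype(float).copy()
            for i in range(A.shape[0]):
                s = np.max(np.abs(A[i, :]))
                if s != 0.0:
                    A[i, :] /= s
                    b[i] /= s
            return A, b

        A = np.array([[0.0001, 0.0001], [1000.0, 1000.0]])
        b = np.array([0.0001, 1000.0])
        print(row_equilibrate(A, b))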

Equilibration is useful in many areas of linear algebra, including eigenvalue calculations. In some cases, column equilibration is preferred, and in other cases, the norms of both the rows and columns are to be controlled.

Equivalent Matrix

Matrices A and B are said to be equivalent if there are nonsingular matrices P and Q so that

A = P * B * Q.

Simple facts about equivalence:

Equivalence is a very loose concept of relatedness. A stronger and more useful concept is similarity.

Exchange Matrix

The exchange matrix J is constructed from the identity matrix by reversing the order of the columns.

For example, the matrix J of order 4:

        0 0 0 1
        0 0 1 0
        0 1 0 0
        1 0 0 0
      

Facts about the exchange matrix J:

The exchange matrix is sometimes called the anti-identity matrix or the counter-identity matrix or the reversal matrix.

External Storage Algorithms

An external storage algorithm is a method of solving a problem that is too large to be loaded into computer memory as a whole.

Instead, the problem is solved incrementally, with most of the problem data residing, at any one time, in computer files, also called "disk storage" or "external storage". It is a considerable task just to rewrite an algorithm so that it can handle a situation where, say, part of a matrix is in one place, and part is in another, remote place.

However, such algorithms must also be aware that data transfers between memory and disk are very slow. Hence, if the algorithm is to be of any use, it must do as much processing as possible on the portion of the data that resides in memory, and read the external problem data into memory as rarely as possible, and in large contiguous "chunks".

Fourier Matrix

The Fourier matrix represents the linear operator that transforms a vector of data into a vector of Fourier coefficients.

Let w indicate a primitive N-th root of unity. Then, if we choose N=4, the matrix F will be:

        1  1    1    1
        1  w    w^2  w^3
        1  w^2  w^4  w^6
        1  w^3  w^6  w^9
      
which simplifies to
        1  1    1    1
        1  w    w^2  w^3
        1  w^2  1    w^2
        1  w^3  w^2  w^1
      
However, we will choose to scale F by 1/sqrt(N).

Facts about the Fourier matrix F:

Frobenius Matrix Norm

The Frobenius matrix norm is a matrix norm that has the simple formula: ||A||F = the square root of the sum of the squares of all the entries of the matrix.

The Frobenius matrix norm is not a vector-bound matrix norm, although it is compatible with the L2 vector norm, and much easier to compute than the L2 matrix norm.

Thus, we can use the L2 norm for vectors, the Frobenius norm for matrices, and write:

||A*x||2 <= ||A||F * ||x||2
although using the L2 matrix norm would in general give us a tighter estimate.

The Frobenius matrix norm is equal to the square root of the sum of the squares of the singular values of the matrix.

The Frobenius matrix norm is sometimes called the Schur matrix norm or the Euclidean matrix norm.

To see why the Frobenius matrix norm is not vector-bound, note that the Frobenius norm of the identity matrix of order n is the square root of n, but a vector-bound matrix norm of the identity matrix must be 1.

Gauss Elimination

Gauss elimination has the goal of producing a solution x to the system of linear equations A*x=b, where A is matrix of order N, and b a vector of length N. The standard version of Gauss elimination used in most algorithms employs partial pivoting.

Gauss elimination accomplishes its goal by decomposing the original matrix A into three factors, a permutation matrix P, a unit lower triangular matrix L, and an upper triangular matrix U. The factors are related to the original matrix by the formula

A = P * L * U.

Once the matrix is factored, it is a simple matter to solve A*x=b, by solving instead P * ( L * ( U * x ) ) = b, because each of the three factors is easy to invert.

Moreover, once the factors are known, the user may solve several linear systems involving A, with different right hand sides.

The determinant of A is equal to the product of the determinants of the factors, and hence is easily computed: the determinant of P is plus or minus 1, and that of L is 1, and that of U is simply the product of its diagonal elements.

The inverse matrix could be solved for, if necessary, by solving the N linear systems A * X(I) = E(I), where E(I) is the I-th Cartesian basis vector. The vectors X(1), X(2), ..., X(N) then are the columns of the inverse of A.

As an example of the Gauss elimination of a matrix, suppose we start with the matrix:

            A
    
          1 2 3
          4 5 6
          7 8 0
      
We can write an imperfect PLU factorization as:
            P          L          U
    
          1 0 0      1 0 0      1 2 3
          0 1 0      0 1 0      4 5 6
          0 0 1      0 0 1      7 8 0
      
The factorization is imperfect because, although A = P*L*U, the matrix U is not upper triangular. We will now modify the matrix U, and update the factors P and L, so that it is always true that A=P*L*U, while the matrix U gradually is transformed into the correct upper triangular form.

Step 1.1: Choose a pivot row in U, namely, row 3. We want to interchange rows 3 and 1 of the matrix U. The elementary permutation matrix P(1,3) does this. We are allowed to insert the inverse of this matrix times itself between L and U in the factorization. We also insert the inverse of this matrix times itself between P and L. If we use primes to denote the updated quantities, these operations are:

        
          A = P * L * U
            = [ P * P-1(1,3) ] * [ P(1,3) * L * P-1(1,3) ] * [ P(1,3) * U ]
            = P' * L' * U'
        
      
The resulting factorization is now:
            P          L          U
    
          0 0 1      1 0 0      7 8 0
          0 1 0      0 1 0      4 5 6
          1 0 0      0 0 1      1 2 3
      
Step 1.2: Eliminate U2,1 by subtracting 4/7 of row 1 from row 2. To do this, we construct the elementary matrix L(1,2,4/7), and insert the product of its inverse and itself into the factorization. Then we absorb the inverse into L, and the matrix into U.
        
          A = P * L * U
            = P * [ L * L-1(1,2,4/7) ] * [ L(1,2,4/7) * U ]
            = P * L' * U'
        
      
The resulting factorization is now:
            P            L           U
    
          0 0 1       1  0 0      7  8  0
          0 1 0      4/7 1 0      0 3/7 6
          1 0 0       0  0 1      1  2  3
      
Step 1.3: Eliminate U3,1 by subtracting 1/7 of row 1 from row 3. To do this, we construct the elementary matrix L(1,3,1/7), and insert the product of its inverse and itself into the factorization. Then we absorb the inverse into L, and the matrix into U.
        
          A = P * L * U
            = P * [ L * L-1(1,3,1/7) ] * [ L(1,3,1/7) * U ]
            = P * L' * U'
        
      
The resulting factorization is now:
            P           L           U
    
          0 0 1      1  0 0      7  8  0
          0 1 0     4/7 1 0      0 3/7 6
          1 0 0     1/7 0 1      0 6/7 3
      
Step 2.2: Choose a pivot row in U, namely, row 3. We want to interchange rows 3 and 2 of the matrix U. The elementary permutation matrix P(2,3) does this. We are allowed to insert the inverse of this matrix times itself between L and U in the factorization. We also insert the inverse of this matrix times itself between P and L. If we use primes to denote the updated quantities, these operations are:
        
          A = P * L * U
            = [ P * P-1(2,3) ] * [ P(2,3) * L * P-1(2,3) ] * [ P(2,3) * U ]
            = P' * L' * U'
        
      
The resulting factorization is now:
            P            L           U
    
          0 1 0       1  0 0      7  8  0
          0 0 1      1/7 1 0      0 6/7 3
          1 0 0      4/7 0 1      0 3/7 6
      
Step 2.3: Eliminate U3,2 by subtracting 1/2 of row 2 from row 3. To do this, we construct the elementary matrix L(2,3,1/2), and insert the product of its inverse and itself into the factorization. Then we absorb the inverse into L, and the matrix into U.
        
          A = P * L * U
            = P * [ L * L-1(2,3,1/2) ] * [ L(2,3,1/2) * U ]
            = P * L' * U'
        
      
The resulting factorization is now:
            P            L            U
    
          0 1 0      1   0  0      7  8   0
          0 0 1     1/7  1  0      0 6/7  3
          1 0 0     4/7 1/2 1      0  0  9/2
      
The PLU factorization is now correct.

You should be able to see a formula for the final factors:

P = I * P-1(1,3) * P-1(2,3)
and
L = P(2,3) * P(1,3) * I * P-1(1,3) * L-1(1,2,4/7) * L-1(1,3,1/7) * P-1(2,3) * L-1(2,3,1/2)
and
U = L(2,3,1/2) * P(2,3) * L(1,3,1/7) * L(1,2,4/7) * P(1,3) * A
You should see that the form of these matrices guarantees that A=P*L*U. The way we carried out the steps guarantees that P stays a permutation matrix, L stays a lower triangular matrix, and U becomes an upper triangular matrix.
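The whole calculation can be collected into a short program. Here is a sketch in Python with NumPy (the function and variable names are our own, not from any library) of Gauss elimination with partial pivoting; applied to the example matrix, it reproduces the factors P, L and U found above.

        import numpy as np

        def plu(A):
            # Gauss elimination with partial pivoting: returns P, L, U
            # with A = P @ L @ U, L unit lower triangular, U upper triangular.
            A = A.astype(float)
            n = A.shape[0]
            P = np.eye(n)
            L = np.eye(n)
            U = A.copy()
            for k in range(n - 1):
                # choose the pivot: largest entry in column k, on or below the diagonal
                p = k + np.argmax(np.abs(U[k:, k]))
                if p != k:
                    U[[k, p], :] = U[[p, k], :]
                    L[[k, p], :k] = L[[p, k], :k]
                    P[:, [k, p]] = P[:, [p, k]]
                # eliminate the entries below the pivot
                for i in range(k + 1, n):
                    L[i, k] = U[i, k] / U[k, k]
                    U[i, :] -= L[i, k] * U[k, :]
            return P, L, U

        A = np.array([[1.0, 2, 3], [4, 5, 6], [7, 8, 0]])
        P, L, U = plu(A)
        print(P, L, U, P @ L @ U, sep="\n")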

Gauss Jordan Elimination

Gauss Jordan elimination is a method for solving a system of linear equations A * x = b for x, or for computing the inverse matrix of A.

Gauss elimination and Gauss Jordan elimination are very closely related. Gauss elimination reduces A to an upper triangular matrix, and saves the elimination factors in a lower triangular matrix. Gauss Jordan elimination proceeds relentlessly until A has been converted into the identity matrix.

Thus, unlike the Gauss elimination procedure, the Gauss Jordan elimination does not produce a factorization of the matrix, but only a solution to the linear system. This means that if a second linear system has to be solved, the matrix has to be set up and eliminated all over again.

The simplest way to describe Gauss Jordan is to note that to solve, say, the linear system A * x = b, the right hand side is appended as an extra column of the matrix. Then, on step I of the elimination, we choose a pivot row, move it to row I, divide it through by the pivot value, and then eliminate the matrix entries in column I from all other rows, rather than simply from rows I+1 through N. When the process is completed, the solution x has overwritten the right hand side b that was stored in column N+1.

Several right hand sides can be handled at once, by appending all of them to the coefficient matrix; the inverse can be computed by appending a copy of the identity matrix to the coefficient matrix before beginning elimination.
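Here is a sketch of the procedure in Python with NumPy (our own illustration; the right hand side is chosen so that the solution is known):

        import numpy as np

        def gauss_jordan(A, b):
            # Reduce the augmented matrix [ A | b ] until A becomes the
            # identity; the last column is then the solution x.
            n = A.shape[0]
            M = np.hstack([A.astype(float), b.reshape(n, 1).astype(float)])
            for i in range(n):
                # partial pivoting: move the largest remaining entry in column i to row i
                p = i + np.argmax(np.abs(M[i:, i]))
                M[[i, p], :] = M[[p, i], :]
                M[i, :] /= M[i, i]              # divide the pivot row by the pivot
                for r in range(n):              # eliminate column i from ALL other rows
                    if r != i:
                        M[r, :] -= M[r, i] * M[i, :]
            return M[:, n]

        A = np.array([[1.0, 2, 3], [4, 5, 6], [7, 8, 0]])
        b = np.array([14.0, 32.0, 23.0])
        print(gauss_jordan(A, b))               # expect [1, 2, 3]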

Gauss Jordan elimination is primarily used as a teaching tool, and for small linear systems. In practical computation, standard Gauss elimination is universally preferred.

Gauss Seidel Iteration For Linear Equations

The Gauss Seidel iteration for linear equations is an iterative method for solving linear systems of equations A*x=b. It is similar to the Jacobi algorithm and the successive overrelaxation method (SOR).

The Gauss Seidel iteration should only be used for matrices which are symmetric and positive definite, or for a matrix which is strictly diagonally dominant.

Each step of the Gauss Seidel iteration begins with an approximate answer x, and produces a new approximation y. Each component of y is computed in order, using the formula:

        y(i) = [ b(i) 
          - a(i,1)*y(1) 
          - a(i,2)*y(2) 
            ... 
          - a(i,i-1)*y(i-1)
          - a(i,i+1)*x(i+1) 
            ... 
          - a(i,n)*x(n) ] / a(i,i)
      
The calculation of each entry of y is dependent on the calculation of the entries of lower index. Thus, the value of y(1) is calculated first, and then y(2) is calculated based on the values of y(1) as well as x(3) through x(n), and so on.

The process is to be repeated until the residual error is small, or the change in the approximate solution is negligible.
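Here is a sketch of the iteration in Python with NumPy (our own illustration, using a fixed number of sweeps rather than a convergence test):

        import numpy as np

        def gauss_seidel(A, b, x, steps=25):
            # Each sweep updates x(i) in place, so components of lower
            # index are the new values and those of higher index are old.
            n = len(b)
            x = x.astype(float).copy()
            for _ in range(steps):
                for i in range(n):
                    s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
                    x[i] = (b[i] - s) / A[i, i]
            return x

        # a strictly diagonally dominant example
        A = np.array([[4.0, 1, 0], [1, 4, 1], [0, 1, 4]])
        b = np.array([5.0, 6.0, 5.0])
        print(gauss_seidel(A, b, np.zeros(3)))   # converges to [1, 1, 1]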

The Gauss Seidel iteration can be considered in terms of its matrix splitting. That is, if we decompose the matrix A into its strictly lower triangular, diagonal, and strictly upper triangular parts:

A = L + D + U
then the method is equivalent to the iteration
( L + D ) * xnew = b - U * x.
which means that the convergence of the algorithm can be understood in terms of the behavior of powers of the iteration matrix:
- ( L + D )-1 * U,
which in turn may best be understood by looking at the eigenvalues.

If the original coefficient matrix A is symmetric, then it may be preferred to use the symmetric Gauss Seidel iteration or SGS. In this case, the iteration consists of pairs of Gauss Seidel steps. The odd steps are the same as the usual iteration. But in the even steps, the variables are solved for in reverse order. Each pair of such steps is a single step of the SGS iteration, which has the property that its iteration matrix is similar to a symmetric matrix (though not necessarily symmetric itself). Among other things, this means that SGS can be used as a preconditioner for certain other problems.

General Matrix Storage

A general matrix is one which has no special matrix structure or matrix symmetry.

In such a case, there are no space advantages to be gained by using a special matrix storage format, and so the matrix entries are stored using the standard two dimensional array format provided by the programming language.

For general matrices, the only remaining issue concerns the problem that occurs when the matrix storage must be set aside before the size of the matrix is known. In FORTRAN, for example, it is common to specify a maximum matrix size of, say, 100 by 100. If the actual problem to be solved is of size 25 by 25, then it may be necessary to describe the data with both the matrix order of 25, and the leading dimension of the storage array, which is 100. In LINPACK and LAPACK, variables containing leading dimension information have names like LDA, LDB and so on.

LAPACK and LINPACK provide routines, with the prefix SGE, which apply to matrices in general storage.

Generalized Permutation Matrix

A generalized permutation matrix is a square matrix A with at most one nonzero entry in each row, and in each column.

A standard permutation matrix, of course, has exactly one nonzero entry in each row and column, and that entry has value 1.

An interesting fact: if A is a nonsingular nonnegative matrix, then the inverse of A is also nonnegative if and only if A is a generalized permutation matrix. In other words, it's very hard for both A and A-1 to be nonnegative; essentially, this can only happen when A is a diagonal matrix with its rows and columns permuted.

So suppose A >= 0 but A is not a generalized permutation matrix. For an arbitrary vector b >= 0 , can we say that the solution x of A * x = b is nonnegative? No, because we know that the inverse of A is not nonnegative. Therefore A-1 contains at least one negative entry, say entry (i,j). Choose b = E(J), where E(J) is the J-th Cartesian basis vector. Then x = A-1 * b and it is easy to see that x(i) is negative.

Gershgorin Disks

The method of Gershgorin disks provides an estimate of the size of the eigenvalues of a matrix. The accuracy of the estimate varies wildly, depending on the size of the elements of the matrix. It is most useful for matrices that are diagonally dominant or sparse.

Gershgorin's theorem states that the eigenvalues of any matrix A lie in the union of the disks Di in the complex plane:

Di = { x : | x - Ai,i | <= Ri }
where Ri is the sum of the absolute values of the off-diagonal elements of row I:
Ri = sum ( J =/= I ) | Ai,j |.

The theorem may also be applied using columns instead of rows.
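A small sketch in Python with NumPy (our own illustration) that computes the disk centers and radii row by row:

        import numpy as np

        def gershgorin_disks(A):
            # Return (center, radius) pairs; every eigenvalue of A lies
            # in the union of the disks | x - A(i,i) | <= R(i).
            disks = []
            for i in range(A.shape[0]):
                radius = np.sum(np.abs(A[i, :])) - np.abs(A[i, i])
                disks.append((A[i, i], radius))
            return disks

        A = np.array([[4.0, 1.0, 0.0],
                      [1.0, 4.0, 1.0],
                      [0.0, 1.0, 4.0]])
        print(gershgorin_disks(A))       # disks centered at 4 with radii 1, 2, 1
        print(np.linalg.eigvals(A))      # all eigenvalues lie in [2, 6]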

Givens Rotation Matrix

A Givens rotation is a linear transformation applied to two vectors, or two rows or columns of a matrix, which can be interpreted as a coordinate axis rotation. The intent of the rotation is to zero out an entry of the vector or matrix using an orthogonal transformation.

A Givens rotation is similar to the elementary row operation that adds a multiple of one row to another, but because a Givens rotation is an orthogonal similarity transformation, it offers greater stability and easy invertibility.

A Givens rotation matrix G has the form:

        1  0  0  0  0  0
        0  c  0  0  s  0  <-- row i
        0  0  1  0  0  0
        0  0  0  1  0  0
        0 -s  0  0  c  0  <-- row j
        0  0  0  0  0  1  
     
           ^        ^
         col i     col j
      
where c = cos(theta) and s = sin(theta) for some angle theta.

Premultiplying A by G has the following effect:

        row i  of A is replaced by   c*row i + s*row j  in G*A;
        row j  of A is replaced by  -s*row i + c*row j  in G*A.
      
while postmultiplication, A*G, would carry out a similar operation on the columns of A.

As an example, to zero out the subdiagonal entry Aj,i of a matrix (the entry in row j, column i) requires a Givens rotation, acting on rows i and j, with values of cosine and sine so that:

- s * Ai,i + c * Aj,i = 0.
It's not actually necessary to compute the underlying rotation angle theta, since c and s can be computed directly:
s = Aj,i / sqrt ( Aj,i2 + Ai,i2 )
c = Ai,i / sqrt ( Aj,i2 + Ai,i2 )

For instance, to zero out the 3,1 entry of this matrix:

        4 2 0
        0 4 5
        3 8 1
      
the sine and cosine are 3/5, 4/5, yielding a Givens matrix G of:
        0.8  0  0.6
         0   1   0
       -0.6  0  0.8
      
and the product G * A:
        5.0 6.4 0.6
         0   4   5
         0  5.2 0.8
      

It is possible to zero out entries of a matrix, one by one, using Givens rotations, similar to the way that Householder matrices are used, to reduce a matrix to a simpler form. The process can be used to zero out the entire lower triangle of a matrix, but further operations on the upper triangle would reintroduce nonzeroes in the lower triangle. Nonetheless, zeroing out the lower triangle means that Givens rotations can be used to produce the QR factorization of the matrix.
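Here is a sketch in Python with NumPy (our own illustration, not a library routine) that builds the rotation used in the example above:

        import numpy as np

        def givens(A, i, j):
            # Build the Givens rotation G, acting on rows i and j, that
            # zeroes out the entry A(j,i) of the product G @ A (0-based).
            r = np.hypot(A[i, i], A[j, i])
            c = A[i, i] / r
            s = A[j, i] / r
            G = np.eye(A.shape[0])
            G[i, i] = c; G[i, j] = s
            G[j, i] = -s; G[j, j] = c
            return G

        A = np.array([[4.0, 2, 0], [0, 4, 5], [3, 8, 1]])
        G = givens(A, 0, 2)      # zero out the 3,1 entry of the example
        print(G)
        print(G @ A)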

Gram Matrix

Given a set of M vectors Vi, each of dimension N, the M by M Gram matrix is formed by all the pairwise dot products of the vectors:


        Gi,j = Vi dot Vj
      

The Gram matrix is a symmetric positive semidefinite matrix. If any vector is linearly dependent on the others, the matrix is singular. If the vectors are linearly independent, the Gram matrix has full rank M. If the vectors form a basis for the space, the Gram matrix has rank N.

The Gram matrix can be used to construct a projection matrix into the space spanned by the vectors.
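A tiny sketch in Python with NumPy (vectors chosen arbitrarily for illustration):

        import numpy as np

        # rows of V are the M vectors, each of dimension N;
        # the third row is the sum of the first two, so the set is dependent
        V = np.array([[1.0, 0.0, 1.0],
                      [0.0, 2.0, 0.0],
                      [1.0, 2.0, 1.0]])

        G = V @ V.T                          # G(i,j) = V(i) dot V(j)
        print(G)
        print(np.linalg.matrix_rank(G))      # rank 2, so G is singular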

Gram Schmidt Orthogonalization

Gram Schmidt orthogonalization is a process which starts with a set of N vectors X(I), each with M components, and produces a set of N2 orthonormal vectors Y which span the linear space of the original vectors.

N2 is less than N if the vectors X(I) are linearly dependent, and equal to N if they are linearly independent. Thus, one use for the Gram Schmidt process is simply to determine if a set of vectors are linearly dependent; a second is to determine the dimension of the space spanned by a linearly dependent set.

The Gram Schmidt process may be defined iteratively:

        for I = 1 to N
    
          N2 = I
    
          Y(I) = X(I)
    
          for J = 1 to I-1
            C(J) = dot_product ( X(I), Y(J) )
            Y(I) = Y(I) - C(J) * Y(J)
          end for
    
          Norm = sqrt ( dot_product ( Y(I), Y(I) ) )
    
          if ( Norm = 0 ) then
            N2 = I - 1
            exit
          end if
     
          Y(I) = Y(I) / Norm
    
        end for
      

Another way of looking at the process is to use the vectors to form the columns of a matrix A. The Gram Schmidt process can then be used to construct one version of the QR factorization of the matrix A:

A = Q * R
where Q is orthogonal and R is upper triangular.
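Here is the same process written as a sketch in Python with NumPy (our own rendering of the pseudocode above):

        import numpy as np

        def gram_schmidt(X, tol=1.0e-12):
            # X is a list of vectors; returns the orthonormal vectors
            # produced before a (numerically) zero vector is met.
            Y = []
            for x in X:
                y = x.astype(float)
                for q in Y:
                    y = y - np.dot(x, q) * q
                norm = np.sqrt(np.dot(y, y))
                if norm <= tol:
                    break
                Y.append(y / norm)
            return Y

        X = [np.array([1.0, 1.0, 0.0]),
             np.array([1.0, 0.0, 1.0]),
             np.array([0.0, 1.0, 1.0])]
        for q in gram_schmidt(X):
            print(q)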

Hadamard Product

The Hadamard product of matrices A and B is a matrix C created by elementwise multiplication:

Ci,j = Ai,j * Bi,j

The Hadamard product is defined for any pair of rectangular matrices, as long as they have the same "shape", that is, the same number of rows, and the same number of columns.

Example of a Hadamard product:

        1 2 3  *   7  8  9 =  7 16 27
        4 5 6     10 11 12   40 55 72
      

Hadamard's Inequality

Hadamard's inequality provides an upper bound on the size of the determinant of a matrix. It is related to the fact that the determinant represents the volume of an N-dimensional parallelepiped.

Let ||C(I)|| designate the Euclidean norm of column I of the matrix A. Hadamard's inequality states that

| det ( A ) | <= ||C(1)|| * ||C(2)|| * ... * ||C(N)||,
with equality holding only if one of the C(I)'s is zero, (yielding the minimum possible value of 0), or if all the C(I)'s are pairwise orthogonal vectors, (yielding the largest possible value).

The theorem may also be applied using rows instead of columns.

Hamiltonian Matrix

A Hamiltonian matrix is a 2Nx2N matrix of the form

        A  G
        Q -A'
      
where the NxN matrices Q and G are symmetric.

Hankel Matrix

A Hankel matrix is a matrix which is constant along each of its anti-diagonals.

Here is an example of a square Hankel matrix:

        7 6 5 4
        6 5 4 3
        5 4 3 2
        4 3 2 1
      
and a rectangular Hankel matrix:
        7 6 5 4 3 2
        6 5 4 3 2 1
        5 4 3 2 1 0
      

Simple facts about a Hankel matrix A:

Compare the related concepts of the Toeplitz Matrix, the Anticirculant Matrix, and the Persymmetric Matrix.

Harwell Boeing Sparse Matrix Collection (HBSMC)

The Harwell Boeing Sparse Matrix Collection, or HBSMC, is a set of 43 data files describing a standard set of test sparse matrices for sparse matrix calculations.

The test set comprises linear systems, least squares problems, and eigenvalue calculations from a wide variety of disciplines. The set is offered as a standard benchmark for comparison of algorithms.

Here is an overview of the source and size of the various matrices:

      Discipline                          Number   Largest  Largest number
                                           of       order    of nonzeroes
                                         matrices
     
      Counter examples, small matrices       3       11        76
      Original Harwell test set             36      822      4841
      Air traffic control                    1     2873     15032
      Astrophysics                           2      765     24382
      Chemical Engineering                  16     2021      7353
      Circuit simulation                     1      991      6027
      Demography                             3     3140    543162
      Economic modelling                    11     2529     90158
      Nuclear reactor core modelling         3     1374      8606
      Optimal power flow problems            3     4929     47369
      Stochastic modelling                   7     1107      5664
      Acoustic scattering                    4      841      4089
      Oil reservoir modelling               19     5005     20033
      Stiff ODE problems                    10      760      5976
      George and Liu test problems          21     3466     13681
      Model PDE problems                     3      900      4322
      Navier Stokes problems                 7     3937     25407
      Unassembled finite element matrices   10     5976     15680
      Oceanography                           4     1919     17159
      Power network matrices                14     5300     13571
      Everstine test set, ship structures   30     2680     23853
      Structures, eigenproblems             22    15439    133840
      Structures, linear equations          36    44609   1029655
      Least squares problems                 4     1850     10608
      

The SPARSKIT package includes utilities for conversion of matrices in the Harwell Boeing format into other formats, such as that used by ELLPACK and ITPACK.

HBSMC Sparse Matrix Storage

The Harwell Boeing Sparse Matrix Collection uses a matrix storage format which is a special kind of sparse matrix storage for most of the matrices in the collection.

The standard sparse matrix format is column oriented. That is, the matrix is represented by a sequence of columns. Each column is held as a sparse vector, represented by a list of row indices of the entries in an integer array and a list of the corresponding values in a separate real array. A single integer array and a single real array are used to store the row indices and the values, respectively, for all the columns.

Data for each column are stored in consecutive locations. The columns are stored in order, and there is no space between columns. A separate integer array holds the location of the first entry of each column, and the first free location. For symmetric and Hermitian matrices, only the entries of the lower triangle are stored, including the diagonal. For antisymmetric matrices, only the strict lower triangle is stored.

Here is a simple example of a 5 by 5 matrix:

        1.0 -3.0  0.0 -1.0  0.0
        0.0  0.0 -2.0  0.0  3.0
        2.0  0.0  0.0  0.0  0.0
        0.0  4.0  0.0 -4.0  0.0
        5.0  0.0 -5.0  0.0  6.0
      
This matrix would be stored in the arrays as follows:
        Subscripts:   1    2    3    4    5    6    7    8    9   10   11
    
        COLPTR        1    4    6    8   10   12
        ROWIND        1    3    5    1    4    2    5    1    4    2    5
        VALUES      1.0  2.0  5.0 -3.0  4.0 -2.0 -5.0 -1.0 -4.0  3.0  6.0
      
We can generate column 5, say, by observing that its first entry is in position COLPTR(5)=10 of arrays ROWIND and VALUES. This entry is in row ROWIND(10)=2 and has value VALUES(10)=3.0. Other entries in column 5 are found by scanning ROWIND and VALUES to position COLPTR(6)-1, that is, position 11. Thus, the only other entry in column 5 is in row ROWIND(11)=5 with value VALUES(11)=6.
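The same lookup can be written in a few lines of Python (our own illustration of the scheme, using the arrays above; the stored indices are 1-based, as in the text):

        # the example matrix above, in the Harwell Boeing column-oriented arrays
        COLPTR = [1, 4, 6, 8, 10, 12]
        ROWIND = [1, 3, 5, 1, 4, 2, 5, 1, 4, 2, 5]
        VALUES = [1.0, 2.0, 5.0, -3.0, 4.0, -2.0, -5.0, -1.0, -4.0, 3.0, 6.0]

        def column_entries(j):
            # Return the (row, value) pairs of column j (1-based), by
            # scanning positions COLPTR(j) through COLPTR(j+1)-1.
            start, stop = COLPTR[j - 1], COLPTR[j] - 1
            return [(ROWIND[k - 1], VALUES[k - 1]) for k in range(start, stop + 1)]

        print(column_entries(5))     # [(2, 3.0), (5, 6.0)]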

HBSMC Finite Element Matrix Storage

The HBSMC finite element storage format is a special matrix storage format for those matrices in the collection which derive from a finite element problem.

Matrices arising in finite element applications are usually assembled from numerous small elemental matrices. The collection includes a few sparse matrices in original unassembled form.

The storage of the individual unassembled matrices is based on the general sparse format, which stores a matrix as a list of matrix columns. The elemental representation stores the matrix as a list of elemental matrices. Each elemental matrix is represented by a list of the row/column indices (variables) associated with the element and by a small dense matrix giving the numerical values by columns, or, in the symmetric case, only the lower triangular part. The lists of indices are held contiguously, just as for the lists of row indices in the standard format. The dense matrices are held contiguously in a separate array, with each matrix held by columns.

Although there is not a one to one correspondence between the arrays of integer and numerical values, the representation does not need to hold pointers to the beginning of the real values for each element. These pointers can be created from the index start pointers (ELTPTR) after noting that an element with NU variables has NU*NU real values, or (NU*(NU+1))/2 in the symmetric case.

We illustrate the elemental storage scheme with a small, 5 by 5 example:

        5.0  0.0  0.0  1.0  2.0
        0.0  4.0  3.0  0.0  6.0
        0.0  3.0  7.0  8.0  1.0
        1.0  0.0  8.0  9.0  0.0
        2.0  6.0  1.0  0.0 10.0
      
generated from four elemental matrices:
            1   4         1   5        2   3   5        3   4
        1 (2.0 1.0)   1 (3.0 2.0)  2 (4.0 3.0 6.0)  3 (2.0 8.0)
        4 (1.0 7.0)   5 (2.0 8.0)  3 (3.0 5.0 1.0)  4 (8.0 2.0)
                                   5 (6.0 1.0 2.0)
      
where the variable indices are indicated by the integers marking the rows and columns. This matrix would be stored in the ELTPTR (location of first entry), VARIND (variable indices) and VALUES (numerical values) arrays as follows:
 
        Subscripts:  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
        ELTPTR:      1   3   5   8  10
        VARIND:      1   4   1   5   2   3   5   3   4
        VALUES:      2.  1.  7.  3.  2.  8.  4.  3.  6.  5.  1.  2.  2.  8.  2.
      

Hermite Normal Form

A nonsingular integer matrix is in Hermite Normal Form if it is lower triangular, all entries are non-negative, and each row has a unique maximum element which is located on the main diagonal.

(In some definitions, the matrix is required to be upper triangular instead.)

Any nonsingular integer matrix can be transformed to Hermite Normal Form using a series of unimodular column operations: interchanging two columns, negating a column, and adding an integer multiple of one column to another column.

For example, given the matrix A:

         5   2  1
        -4   2  4
         0  -3  6
      
its Hermite normal form is:
         1   0   0
         4   6   0
         6  15  30
      

Hermitian Matrix

A Hermitian matrix A is a complex matrix that is equal to its complex conjugate transpose, symbolized by a "*" or "H":

A = AH
or
A = A*

Here is a Hermitian matrix:

        1     1+2i  3-4i
        1-2i  4       6i
        3+4i   -6i  8
      

Simple facts about a Hermitian matrix A:

The corresponding concept for real matrices is symmetric.

Hessenberg Matrix

An upper Hessenberg matrix is a matrix which is entirely zero below the first subdiagonal.

An upper Hessenberg matrix is "almost" upper triangular. A lower Hessenberg matrix is, of course, entirely zero above the first superdiagonal. Upper Hessenberg matrices occur so often that they are frequently simply called "Hessenberg" matrices.

An example of an upper Hessenberg matrix is:

        1 2 3 4 5
        6 7 8 9 1
        0 5 4 3 2
        0 0 8 3 7
        0 0 0 9 1
      

Eigenvalue programs typically transform a matrix into upper Hessenberg form, and then carry out the QR method on this matrix, which converges rapidly to an upper quasi-triangular matrix: upper triangular except for 2 by 2 diagonal blocks corresponding to pairs of complex eigenvalues.

The reason for transforming a matrix into upper Hessenberg form is that the QR method is much less expensive if carried out on an upper Hessenberg matrix. It is actually cheaper to go to the additional trouble of transforming a matrix to upper Hessenberg form, and then carrying out the QR method on that matrix, rather than carrying out the QR method on the original matrix.

Hölder Norms (p-norms)

The Hölder norm of a vector, sometimes called the p-norm, is a vector norm of the form ||x||p = ( sum ( 1 <= i <= n ) |xi|p )(1/p) where p is any real value greater than or equal to 1.

Although the formula itself does not make sense for p equal to infinity, there is also a Hölder norm for that case, obtained as the limit of the p-norms as p increases; it's simply the maximum of the absolute values of the entries of the vector.

The L1 norm, the L2 norm, and the L Infinity norm are exactly the Hölder norms with p equal to 1, 2 and infinity.

Householder Matrix

A Householder matrix for a given vector v has the form:

H = I - 2 * v * v' / ( norm2 ( v ) )2
or, in the common case where the Euclidean norm of v is 1, we may write:
H = I - 2 * v * v'
Note that the Householder matrix is a sort of "overenthusiastic" projection matrix; instead of removing the component in the direction of v, it reverses it.

For the simple case where v = (1/3, 2/3, 2/3), here is what H would look like:

        (1 0 0)       (1/9 2/9 2/9)   ( 7/9 -4/9 -4/9)
        (0 1 0) - 2 * (2/9 4/9 4/9) = (-4/9  1/9 -8/9)
        (0 0 1)       (2/9 4/9 4/9)   (-4/9 -8/9  1/9)
      

A little "reflection" will convince you that the Householder matrix for any vector v will always be symmetric, just like this example.

The Householder matrix is also an orthogonal matrix:

H * H' = H * H = I.

Householder matrices can be used to compute the QR factorization of a matrix. A Householder matrix can be found which will "wipe out" all the subdiagonal entries of the first column of the original matrix. Another Householder matrix can be found which will "wipe out" all the subdiagonal entries of the second column, and so on. At the end of N-1 steps of this process, we have computed

Hn-1 * ... * H2 * H1 * A = R
where R is upper triangular. But the product
H = Hn-1 * ... * H2 * H1
is an orthogonal matrix. We can multiply both sides by the transpose of H, which is also its inverse, to get
A = H' * R
or, if we define Q to be H':
A = Q * R.
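Here is a sketch of this process in Python with NumPy (our own illustration, not a library routine):

        import numpy as np

        def householder_qr(A):
            # Use a reflector H(k) = I - 2 v v' to wipe out the subdiagonal
            # of column k of R, for k = 1, 2, ...; accumulate Q so that
            # A = Q @ R with Q orthogonal and R upper triangular.
            A = A.astype(float)
            m, n = A.shape
            R = A.copy()
            Q = np.eye(m)
            for k in range(min(m - 1, n)):
                x = R[k:, k].copy()
                alpha = np.linalg.norm(x)
                if alpha == 0.0:
                    continue
                if x[0] < 0.0:
                    alpha = -alpha
                v = x
                v[0] += alpha                 # v = x + sign(x(1)) * ||x|| * e1
                v /= np.linalg.norm(v)
                R[k:, :] -= 2.0 * np.outer(v, v @ R[k:, :])
                Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
            return Q, R

        A = np.array([[4.0, 2, 0], [0, 4, 5], [3, 8, 1]])
        Q, R = householder_qr(A)
        print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))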

The Householder matrix is sometimes called an elementary reflector.

Idempotent Matrix

An idempotent matrix A has the property that

A * A = A
An idempotent matrix is "not quite" a projection matrix.

Simple facts about an idempotent matrix A:

The Identity Matrix

The identity matrix, usually denoted I, is a square matrix with 1's on the main diagonal and 0's elsewhere. The identity matrix behaves like the number 1 in matrix multiplication. For any matrix A:

A * I = I * A = A.

Here is the 3 by 3 identity matrix:

        1 0 0
        0 1 0
        0 0 1
      

Simple facts about the identity matrix I:

Ill Conditioned Linear System

When solving a linear system A * x = b, the matrix A is said to be ill conditioned if small errors or perturbations in the coefficient matrix A or right hand side b correspond to large errors or perturbations in the solution x.

A numerical scale for ill conditioning is provided by the condition number.

Linear systems which are extremely ill conditioned may be impossible to solve accurately. A standard example of an ill conditioned matrix is the Hilbert Matrix, with Ai,j = 1 / ( I + J - 1 ).

The Incomplete Cholesky factorization

The incomplete Cholesky factorization is an approximate Cholesky factorization, of a matrix A, comprising a lower triangular matrix L for which A is approximately equal to L*L'.

The incomplete Cholesky factorization is a special case of the Incomplete LU Factorization, modified for symmetric positive definite matrices. There are known cases where it will fail.

Typically, the matrix A is large and sparse, (as well as being symmetric and positive definite) and an iterative scheme is being used to solve the linear system A*x=b. The incomplete Cholesky factorization is intended as a preconditioner, which modifies the linear system, improving the convergence rate of the iterative scheme. In particular, the preconditioner matrix is M=L*L', and we are well advised to store L instead of M, since this gives us an easy way of solving linear systems associated with M.

The computation of the incomplete Cholesky factorization is similar to that of the usual Cholesky factor except for two points. First, we assume that no pivoting is required. Secondly, if the original matrix A has a zero entry, then we require that the corresponding entry of the L matrix be zero. This means that there is no fill in (and it is also why L*L' will not equal A). Because there is no fill in, the factor L can actually be stored in a data structure that is the same as that of A.

Because no pivoting is allowed, the algorithm may break down because of negative elements that may appear on the diagonal. See the Kershaw matrix for a simple example.
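Here is a sketch of the zero-fill variant just described, written for a dense array in Python with NumPy (our own illustration; a practical code would work on a sparse data structure). Applied to the Kershaw matrix described below, it encounters a nonpositive pivot and stops.

        import numpy as np

        def incomplete_cholesky(A):
            # Same recurrence as the ordinary Cholesky factor, but updates
            # are applied only where A already has a nonzero entry, so L
            # keeps the sparsity pattern of the lower triangle of A.
            n = A.shape[0]
            L = np.tril(A).astype(float)
            for k in range(n):
                if L[k, k] <= 0.0:
                    raise ValueError("breakdown: nonpositive pivot at step %d" % k)
                L[k, k] = np.sqrt(L[k, k])
                for i in range(k + 1, n):
                    if L[i, k] != 0.0:
                        L[i, k] /= L[k, k]
                for j in range(k + 1, n):
                    for i in range(j, n):
                        if L[i, j] != 0.0:
                            L[i, j] -= L[i, k] * L[j, k]
            return L

        # the Kershaw matrix: positive definite, yet the factorization breaks down
        A = np.array([[3.0, -2, 0, 2], [-2, 3, -2, 0], [0, -2, 3, -2], [2, 0, -2, 3]])
        try:
            incomplete_cholesky(A)
        except ValueError as e:
            print(e)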

The Incomplete LU factorization

The incomplete LU factorization is an approximate LU factorization of a matrix A, comprising a unit lower triangular matrix L and an upper triangular matrix U.

Typically, the matrix A is large and sparse, and an iterative scheme is being used to solve the linear system A*x=b. The incomplete LU factorization is intended as a preconditioner, which modifies the linear system, improving the convergence rate of the iterative scheme. In particular, the preconditioner matrix is M=L*U, and we are well advised to store L and U instead of M, since this gives us an easy way of solving linear systems associated with M.

The computation of the incomplete LU factorization is similar to that of the usual PLU factors except for two points. First, we assume that no pivoting is required, hence the P factor can be omitted. Secondly, if the original matrix A has a zero entry, then we require that the corresponding entry of the U matrix be zero. This means that there is no fill in (and it is also why L*U will not equal A). Because there is no fill in, the factors L and U can actually be stored in a data structure that is the same as that of A. The elements of L are stored in the positions devoted to lower triangular elements of A, while U goes in the diagonal and upper triangular locations.

If the matrix A is symmetric and positive definite, then the related Incomplete Cholesky Factorization may be tried.

Inertia

The oddly named inertia of a (square) matrix is the numbers of negative, zero, and positive eigenvalues.

Sylvester's Law of Inertia states that if A and B are congruent matrices, then we cannot guarantee that they have the same eigenvalues, but they do have the same inertias.

This theorem allows us to determine if a symmetric matrix is positive definite. Because the matrix is symmetric, we can compute an LDL factorization:

A = L * D * L'
where L is unit lower triangular, and D is diagonal. This means that A is congruent to the diagonal matrix D. But the eigenvalues of D are easily determined, from which we can get the inertia of D. This is equal to the inertia of A; in particular, if D has only positive eigenvalues, then so does A, which is therefore positive definite.

Moreover, if such a factorization is cheap, as for a tridiagonal symmetric matrix, then we can search for eigenvalues by seeking diagonal shifts of the matrix that cause the number of negative eigenvalues to change by 1.

Inner Product

An inner product is a scalar-valued function of two vectors x and y, denoted (x,y), with the properties that:

A vector inner product (x,y) can be used to define a corresponding vector norm ||x||:

|| x || = sqrt ( x, x ).
If the inner product and norm are related in this way, then the Cauchy-Schwarz inequality relates them.

The inner product is sometimes referred to as the dot product (because it is often represented as x dot y), or as the scalar product (because its result is a scalar value).

Integer Matrix

An integer matrix is simply a matrix whose entries are all integers.

Examples of integer matrices include

If A is an integer matrix, it is natural to ask whether this tells us something about quantities related to A.

It is no surprise that if A is an integer matrix, then so are its trace (an integer) and its transpose (an integer matrix).

Moreover, if A and B are integer matrices, then it is obvious that the product A*B will also be an integer matrix.

Since the determinant is a sum of products of entries of the matrix, if A is an integer matrix then det(A) must be an integer.

The adjoint matrix and the cofactor matrix, being formed from determinants of submatrices of A, will be integer matrices if A is.

Since the inverse matrix can be formed using Cramer's rule, which involves the ratio of determinants of a submatrix of A divided by the determinant of A, it is not necessarily the case that inverse(A) is an integer matrix, but it must be the case that det(A)*inverse(A) is an integer matrix.

For similar reasons, if A is an integer matrix, and b is an integer vector, then the solution x of the linear system A*x=b is not necessarily an integer vector, but det(A)*x will be an integer vector.

If A is a triangular matrix with unit diagonal and is an integer matrix, then inverse(A) will have the same form and will be an integer matrix too. (Of course, this also follows from the fact that det(A) must be 1!)

Inverse Matrix

The inverse matrix of a square matrix A, if it exists, is a matrix denoted A-1 with the property that

A * A-1 = A-1 * A = I.

If the inverse matrix exists, it is unique, and A is said to be nonsingular or invertible. Otherwise, A is singular.

If the inverse of A exists, then the solution of

A * x = b
can be immediately written down:
x = A-1 * b.

However, it's not a good idea to solve a linear system in this way. The inverse is relatively expensive to compute, and subject to greater inaccuracies than other solution methods. This is not to say that the inverse isn't useful. Orthogonal and unitary transformations are so popular in numerical linear algebra because their inverses "come for free"; (and their inverses are very well conditioned).

Simple facts:

LAPACK and LINPACK include routines for explicitly computing the inverse of a given matrix.

Inverse Power Method

The inverse power method is a technique for solving the eigenvalue problem.

The inverse power method is related to the power method, but is more flexible. Through the use of shifts, it can be "aimed" to seek the eigenvalue that is closest to any particular target value. Moreover, once a rough estimate of an eigenvalue is made, the shift can be set to this value, which will increase the convergence rate of the algorithm.

The inverse power method only requires that the user be able to repeatedly solve linear systems of the form:

A * x = b
or, when shifts are used,
( A - shift * I ) * x = b.
Thus, for instance, if A is a band matrix or a sparse matrix, the user can employ a storage method and solution algorithm appropriate to the particular form of the problem.

The inverse power method gets its name from the fact that, when the shift is zero, it is equivalent to using the power method on the matrix A-1.

The inverse power method begins by picking a starting guess for the eigenvector x, and the eigenvalue lambda. If the shift will not vary, then a fixed value should be set now. Then the following steps are repeated as often as necessary: solve the linear system ( A - shift * I ) * y = x for y; normalize y to obtain the updated eigenvector estimate x; update the eigenvalue estimate lambda, for instance with the Rayleigh quotient ( x, A * x ) / ( x, x ); and stop when the estimates settle down.
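A minimal sketch of the iteration with a fixed shift, in Python with NumPy (the function name and the test matrix are our own choices):

        import numpy as np

        def inverse_power(A, shift, steps=50):
            # Repeatedly solve (A - shift*I) y = x and normalize; the
            # iterate converges toward the eigenvector whose eigenvalue
            # is closest to the shift.
            n = A.shape[0]
            x = np.ones(n)
            M = A - shift * np.eye(n)
            for _ in range(steps):
                y = np.linalg.solve(M, x)
                x = y / np.linalg.norm(y)
            lam = x @ A @ x / (x @ x)        # Rayleigh quotient estimate
            return lam, x

        A = np.array([[2.0, 1.0], [1.0, 2.0]])   # eigenvalues 1 and 3
        print(inverse_power(A, 2.5))             # finds the eigenvalue 3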

Invertible Matrix

An invertible matrix A is a (square) matrix for which there exists an inverse matrix B, called the inverse of A.

A square matrix A is invertible if and only if:

A common situation involves a "small" perturbation matrix added to the identity matrix. The matrix I+A is guaranteed to be invertible if it is true that ||A|| < 1 for some vector-bound matrix norm.

Involutory Matrix

An involutory matrix A has the property that

A * A = I.

Simple facts about an involutory matrix A:

Irreducible Matrix

An irreducible matrix is a (square) matrix which is not reducible.

Definition One: A reducible matrix is one which can be rearranged into the following block form:

        ( P Q )
        ( 0 R )
      
where P and R are square sub-blocks, and "0" represents a (nonempty) rectangular block of zero entries. The rearrangement can be done by simultaneous row and column permutations. An irreducible matrix is one which is not reducible.

Definition Two: A matrix is irreducible if and only if, for any row index i and column index j, there is always a nonnegative integer p (which may be 0) and a sequence of integers k1, ..., kp so that the product

Ai,k1 * Ak1,k2 * ... * Akp,j
is nonzero.

Theorem: A nonnegative matrix A of order N is irreducible if and only if every entry of the matrix ( I + A )N-1 is strictly positive.

Definition Three: A matrix of order N is irreducible if, for any division of the integers between 1 and N into two nonempty disjoint sets K1 and K2, there is always a nonzero element Ai,j with I in K1 and J in K2.

The concept of an irreducible matrix is mainly of interest in the analysis of the convergence of certain iterative schemes for linear equations. One key idea is the following: if the matrix A is irreducible and diagonally dominant, then A is nonsingular.

If you only know that the matrix is diagonally dominant, then Gershgorin's theorem would still not rule out an eigenvalue of zero, and hence singularity. It's the irreducibility that guarantees no zero eigenvalue here. The "1, -2, 1" tridiagonal matrix is an example where this theorem applies.

Here's an example of an irreducible matrix with zero main diagonal:

      0 1 0 0
      1 0 1 0
      0 1 0 1
      0 0 1 0
      

If you're familiar with graph theory, then a matrix A is irreducible if and only if the digraph with corresponding adjacency matrix is strongly connected. (Consider node I to be connected to node J if Ai,j is nonzero. The digraph is strongly connected if you can get from any node to any other node following directed edges.)

A simple algorithm to determine whether a matrix is irreducible follows from the strong connectivity of the corresponding digraph. Choose any node P. First, determine all the nodes that can be reached from P by repeatedly following the nonzero entries in the rows of the nodes found so far, until no new nodes appear; if some node is never reached, the matrix is reducible. Then repeat the search on the transpose of the matrix (that is, following the edges backwards) to check that P can be reached from every node. If both searches reach every node, the digraph is strongly connected and the matrix is irreducible; otherwise it is reducible.
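Here is a sketch of that check in Python with NumPy (our own illustration), applied to the example matrix above:

        import numpy as np

        def is_irreducible(A):
            # A is irreducible exactly when the digraph with an edge
            # i -> j for each nonzero A(i,j) is strongly connected:
            # every node is reachable from node 0, and node 0 is
            # reachable from every node (a search on the transpose).
            def reaches_all(B):
                n = B.shape[0]
                seen = {0}
                stack = [0]
                while stack:
                    i = stack.pop()
                    for j in range(n):
                        if B[i, j] != 0 and j not in seen:
                            seen.add(j)
                            stack.append(j)
                return len(seen) == n
            return reaches_all(A) and reaches_all(A.T)

        A = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
        print(is_irreducible(A))     # True, as claimed above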

Iterative Methods for Eigenvalues

A method is called iterative when it consists of a basic series of operations which are carried out over and over again, until the answer that is produced is no longer significantly changing, or some exceptional error occurs, or the limit on the number of steps is exceeded.

All eigenvalue problems are solved by iterative methods, except for the "toy" problems presented in textbooks. This is because the computation of eigenvalues is equivalent to finding the roots of a polynomial, and there is no explicit method for finding the roots of a general polynomial of degree 5 or higher.

The best known methods for the eigenvalue problem include:

Iterative Methods for Linear Equations

A method is called iterative when it consists of a basic series of operations which are carried out over and over again, until the answer that is produced is no longer significantly changing, or some exceptional error occurs, or the limit on the number of steps is exceeded.

A direct method is the "opposite" of an iterative method. A fixed number of operations are carried out once, at the end of which the solution is produced. Gauss elimination on a linear system is an example of such a direct method. Direct methods are the primary method for solving small or dense or nonsymmetric linear systems.

The most common reason for using an iterative method is that it can require far less storage than a direct method. An iterative method typically only sets aside storage for the original nonzero entries of the matrix; no fill in occurs. Standard direct methods must set aside an entry for every possible position in the matrix, though some reduction in this requirement is possible if the matrix is banded.

Secondly, each iteration typically takes much less time than a full direct solve; thus, it is possible, for some problems, that an iterative method will actually converge to an acceptable answer more quickly than a direct method.

An iterative method has numerous disadvantages. You will need a starting point, and if you pick a poor one that may slow down convergence. Your system matrix usually needs to satisfy extra conditions beyond merely being nonsingular. The rate of convergence may be extremely slow, although this can be helped by a suitable preconditioner. If you have a very large problem and are storing the matrix in a compact form, the programming and computational cost involved in storing and retrieving coefficient data can exceed that of the solution phase.

Iterative methods are generally only suitable for certain kinds of system matrices. The most common requirements are that the matrix be positive definite symmetric or strictly diagonally dominant.

Iterative methods for solving systems of linear equations include:

Iterative Refinement

Iterative refinement is an attempt to "improve" a computed solution x0 for the linear system of equations A*x=b.

The algorithm is occasionally effective when the coefficient matrix is ill conditioned, a problem that may become evident if the residual error is computed, and seen to be relatively large.

A single step of the algorithm involves computing the residual error:

r := b - A * x,
solving the linear system:
A * dx = r,
and adding the correction dx to the original solution x.
x := x + dx.

In order to achieve any improvement in accuracy, the residual error should be calculated in higher arithmetic precision. The rest of the calculation can be done in single precision, allowing the use of the already computed LU factorization of A, if available. However, to compute the residual error, the original matrix A is needed. It is preferable to figure out how to use the LU factors to compute the residual error, rather than keeping two copies of the matrix, one factored and one untouched.
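Here is a sketch of the procedure in Python with NumPy (our own illustration). For simplicity the residual is computed in the same double precision as everything else, and the correction system is re-solved from scratch instead of reusing a stored LU factorization; a production code would handle both points as described above.

        import numpy as np

        def refine(A, b, x, steps=3):
            # Each step: residual r = b - A x, correction from A dx = r,
            # then update x := x + dx.
            for _ in range(steps):
                r = b - A @ x
                dx = np.linalg.solve(A, r)
                x = x + dx
            return x

        A = np.array([[4.0, 1.0], [1.0, 3.0]])
        b = np.array([1.0, 2.0])
        x0 = np.array([0.1, 0.6])        # a deliberately rough starting solution
        print(refine(A, b, x0))          # approaches the true solution [1/11, 7/11]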

If the residual error of the new solution is still too large, but somewhat better, then the procedure may be repeated as often as desired.

A subroutine SGEIM for carrying out iterative refinement using the LINPACK routines is described in the LINPACK User's Guide.

Jacobi Algorithm for Eigenvalues

The Jacobi algorithm for computing eigenvalues is an iterative method, which may be applied to a symmetric matrix A, to compute its eigenvalues and eigenvectors.

The method will produce an orthogonal matrix Q and an "approximately" diagonal matrix lambda such that

A = Q * lambda * Q'.

The steps of the iteration compute a sequence of orthogonal matrices Q1, Q2, ... which transform the original matrix A into matrices A1, A2, ... each of which has the same eigenvalues as A, but each of which is "more" diagonal than A was. Starting with A, we determine Q1, and produce A1, which is orthogonally similar to A:

A1 = Q1' * A * Q1.
We then determine a matrix Q2 that produces A2:
A2 = Q2' * A1 * Q2
= ( Q1 * Q2 )' * A * ( Q1 * Q2 ).

The point of the algorithm is how we choose the matrices Q at each step. Q1 is chosen in such a way as to "annihilate" the (1,2) and (2,1) elements of A. That is, the (1,2) and (2,1) elements of A1 will be zero. Q2 will eliminate the (1,3) and (3,1) elements of A1. Thus A2 will have zeroes in the (1,3) and (3,1) positions. Unfortunately, the (1,2) and (2,1) positions of A2, which we just zeroed out on the previous step, will not remain zero, but will fill in again! However, the fill in values are generally smaller than the original values. As we sweep through the entire matrix, we repeatedly annihilate the off-diagonal elements until they all have decreased below some tolerance.

It can be shown that the sum of the squares of the off diagonal elements always decreases to zero with the iteration. Thus, for some iteration step M, one can expect to have a matrix AM and a matrix

Q = Q1*Q2*...*QM
so that
AM = Q' * A * Q
where AM is "essentially" diagonal. At that point, we can rewrite this equation as
A * Q = Q * lambda
where lambda, the diagonal matrix formed from the diagonal entries of AM, holds the eigenvalues of A, and the columns of the transformation matrix Q are the eigenvectors of A. Thus, if the off diagonal elements disappear as promised, we have approximately solved our eigenvalue problem.

So which orthogonal matrix Q zeroes out a specific pair of entries? The formula is fairly simple. To annihilate the arbitrary entries Ai,j and Aj,i, the matrix Q is equal to the identity matrix, except for:

        
          Q(I,I) =     C    Q(I,J) = S
          Q(J,I) =   - S    Q(J,J) = C
        
      
where C and S are the cosine and sine of some rotation angle THETA. Thus, each matrix Q is a Givens rotation matrix. We can compute C and S directly (skipping the computation of THETA) by the following formula:
        
          U = ( A(J,J) - A(I,I) ) / ( 2 * A(I,J) )
          T = sign ( U ) / ( | U | + sqrt ( U2 + 1 ) )
          C = 1 / sqrt ( T2 + 1 )
          S = T / sqrt ( T2 + 1 )
        
      

The Jacobi method is simple and easy to program, but is usually slower than the QR method.
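Here is a rough sketch of the whole iteration in Python with NumPy (our own illustration, using a fixed number of sweeps rather than a tolerance test):

        import numpy as np

        def jacobi_eigen(A, sweeps=10):
            # Repeated sweeps of rotations; returns the approximate
            # eigenvalues (the diagonal of the transformed matrix) and the
            # accumulated Q, with the original matrix approximately equal
            # to Q @ diag(lam) @ Q.T.
            A = A.astype(float).copy()
            n = A.shape[0]
            Q = np.eye(n)
            for _ in range(sweeps):
                for i in range(n - 1):
                    for j in range(i + 1, n):
                        if A[i, j] == 0.0:
                            continue
                        u = (A[j, j] - A[i, i]) / (2.0 * A[i, j])
                        t = np.sign(u) / (abs(u) + np.sqrt(u * u + 1.0)) if u != 0.0 else 1.0
                        c = 1.0 / np.sqrt(t * t + 1.0)
                        s = t * c
                        R = np.eye(n)
                        R[i, i] = c; R[i, j] = s
                        R[j, i] = -s; R[j, j] = c
                        A = R.T @ A @ R      # annihilates A(i,j) and A(j,i)
                        Q = Q @ R
            return np.diag(A), Q

        A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
        lam, Q = jacobi_eigen(A)
        print(np.sort(lam))
        print(np.sort(np.linalg.eigvalsh(A)))    # agrees with a library routine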

The Jacobi Algorithm for Linear Equations

The Jacobi algorithm for linear equations is an iterative method for solving linear systems of equations A * x = b.

The method is similar to the Gauss Seidel and Successive Overrelaxation Method (SOR), but has the distinction of being the only simple iterative method that is easy to program in parallel.

The Jacobi iteration is guaranteed to converge for matrices which are strictly diagonally dominant or irreducibly diagonally dominant.

It is also frequently applied to matrices which are symmetric and positive definite, but if that is the only property of the matrix, then convergence of the Jacobi iteration cannot be guaranteed.

Each step of the Jacobi iteration begins with an approximate solution x. An improved approximate solution xnew is computed by solving N independent linear equations:

        xnew(i) = [ b(i) 
          - A(i,1)   * x(1) 
          - A(i,2)   * x(2) 
          ...
          - A(i,i-1) * x(i-1)
          - 0.0      * x(i)    <-- note that we are skipping x(i) here!
          - A(i,i+1) * x(i+1) 
          ... 
          - A(i,n)   * x(n) ] / A(i,i)
      

Once xnew is computed, the residual error is calculated:

r = A * xnew - b
and the solution increment:
dx = xnew - x.
If the norms of these vectors are satisfactorily small, the iteration may be halted. Otherwise, xnew becomes the new starting guess x for the next step of the iteration.
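Here is a sketch of the iteration in Python with NumPy (our own illustration, using a fixed number of steps instead of the residual test):

        import numpy as np

        def jacobi(A, b, x, steps=50):
            # Every component of xnew uses only the OLD iterate x, so the
            # n component updates are independent and could run in parallel.
            n = len(b)
            x = x.astype(float).copy()
            for _ in range(steps):
                xnew = np.empty(n)
                for i in range(n):
                    s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
                    xnew[i] = (b[i] - s) / A[i, i]
                x = xnew
            return x

        A = np.array([[4.0, 1, 0], [1, 4, 1], [0, 1, 4]])
        b = np.array([5.0, 6.0, 5.0])
        print(jacobi(A, b, np.zeros(3)))     # converges to [1, 1, 1]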

The Jacobi iteration can be considered in terms of its matrix splitting. That is, if we decompose the matrix A into its strictly lower triangular, diagonal, and strictly upper triangular parts:

A = L + D + U
then the method is equivalent to the iteration
D * xnew = b - ( L + U ) * x.
which means that the convergence of the algorithm can be understood in terms of the behavior of powers of the iteration matrix:
- D-1 * ( L + U ),
which in turn may best be understood by looking at the eigenvalues. Note that if A is symmetric with positive diagonal entries, the iteration matrix, while not symmetric in general, is similar to the symmetric matrix - D-1/2 * ( L + U ) * D-1/2, so its eigenvalues are real.

The Jacobi Preconditioner

The Jacobi preconditioner is a very simple preconditioning matrix for use with an iterative method for linear equations.

For a given system matrix A, the Jacobi preconditioner matrix M is the diagonal matrix whose entries are defined by

M = diag ( A )
In effect, using the Jacobi preconditioner amounts to dividing each equation and right hand side entry by the corresponding diagonal entry of the original coefficient matrix.

Jordan Canonical Form

The Jordan canonical form of a matrix A is an upper bidiagonal matrix whose main diagonal contains the eigenvalues of A, and whose superdiagonal contains only zeroes or ones. The location of the ones in the superdiagonal is determined by the algebraic and geometric multiplicity of the eigenvalues.

Here is an example of a Jordan canonical form:

        4 0 0 0 0
        0 2 1 0 0
        0 0 2 1 0
        0 0 0 2 0
        0 0 0 0 3
      

Every matrix is similar to its Jordan Canonical Form. That is, for any matrix A, there exists a nonsingular matrix S so that

A = S * J * S-1
where J has Jordan canonical form. This form can also be regarded as a matrix factorization.

If A is real, but has complex eigenvalues, the matrix J has complex entries.

The Jordan canonical form is of little interest to computational linear algebraists. Unless exact arithmetic is used, it is extremely sensitive to small errors, and the information it provides can be computed more reliably in other ways.

Kershaw Matrix

The Kershaw matrix is a simple 4 by 4 positive definite symmetric matrix, which is an example for which the symmetric version of the incomplete Cholesky factorization gives a negative pivot, causing the algorithm to break down.

The Kershaw matrix:


         3  -2   0   2
        -2   3  -2   0
         0  -2   3  -2
         2   0  -2   3
      

Reference:

  1. David Kershaw,
    The Incomplete Cholesky-Conjugate Gradient Method for the Iterative Solution of Systems of Linear Equations,
    Journal of Computational Physics,
    Volume 26, pages 43-65, 1978.

Krylov Matrix

Given a square matrix A and some initial vector b, the first k elements of the sequence of Krylov vectors are:


        b, Ab, A2b, A3b, ... Ak-1b;
      
the corresponding Krylov subspace is the subspace spanned by these vectors; the Krylov matrix is the matrix whose columns are formed by the Krylov vectors.
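A small sketch in Python with NumPy (our own illustration):

        import numpy as np

        def krylov_matrix(A, b, k):
            # columns are b, A*b, A^2*b, ..., A^(k-1)*b
            cols = [b]
            for _ in range(k - 1):
                cols.append(A @ cols[-1])
            return np.column_stack(cols)

        A = np.array([[2.0, 1.0], [0.0, 3.0]])
        b = np.array([1.0, 1.0])
        print(krylov_matrix(A, b, 3))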

L Matrix

An L matrix is a matrix whose diagonal elements are strictly positive, and whose off-diagonal elements are nonpositive.

The identity matrix is a "trivial" example of an L matrix.

L1 Matrix Norm

The L1 matrix norm is a matrix norm that is vector-bound to, and hence compatible with, the L1 vector norm.

Thus, the formal definition of the norm is

||A|| = max ( || A*x || / ||x|| )
where the vector norm used on the right hand side is the L1 vector norm, and the maximum is taken over all nonzero vectors x.

However, it is easy to show that the L1 matrix norm has a simpler formula: ||A|| = the maximum, over all matrix columns, of the sum of the absolute values of the entries in the column.
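
As a small numerical check, assuming numpy is available, the column-sum formula agrees with numpy's vector-bound L1 matrix norm:

        import numpy as np

        A = np.array([[ 1.0, -7.0],
                      [ 2.0,  3.0]])

        col_sum_formula = np.max(np.sum(np.abs(A), axis=0))   # largest absolute column sum
        assert np.isclose(col_sum_formula, np.linalg.norm(A, 1))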

L1 Vector Norm

The L1 vector norm is a vector norm defined as

||x|| = sum ( 1 <= I <= N ) |x(i)|.

The L1 vector norm is an example of a Hölder norm.

L2 Matrix Norm

The L2 matrix norm is a matrix norm that is vector-bound to, and hence compatible with, the L2 vector norm.

Thus, the formal definition of the norm is

||A|| = max ( || A*x || / ||x|| )
where the vector norm used on the right hand side is the L2 vector norm, and the maximum is taken over all nonzero vectors x.

The L2 matrix norm has three other formulations that allow it to be computed:

The computation of the L2 norm is expensive, and so it is often simpler to use the easily-computed Frobenius matrix norm, which is not vector-bound to the L2 vector norm, but is compatible with it.

L2 Vector Norm

The L2 vector norm is a vector norm defined as

||x|| = sqrt ( sum ( 1 <= I <= N ) x(i)2 )

The L2 vector norm is also known as the Euclidean vector norm or the root-mean-square vector norm.

The L2 vector norm is an example of a Hölder norm.

L Infinity Matrix Norm

The L Infinity matrix norm is a matrix norm that is vector-bound to, and hence compatible with, the L Infinity vector norm.

Thus, the formal definition of the norm is

||A|| = max ( || A*x || / ||x|| )
where the vector norm used on the right hand side is the L Infinity vector norm, and the maximum is taken over all nonzero vectors x.

However, it is easy to show that the L Infinity matrix norm has a simpler formula: ||A|| = the maximum, over all matrix rows, of the sum of the absolute values of the entries in the row.

L Infinity Vector Norm

The L Infinity vector norm is a vector norm defined as

||x|| = max ( 1 <= I <= N ) |x(i)|.

The L Infinity vector norm is an example of a Hölder norm.

The Lanczos Iteration

The Lanczos iteration is an iterative method that seeks to approximate some of the eigenvalues of a very large real symmetric (or Hermitian) matrix.

The Lanczos iteration is a version of the Arnoldi iteration that takes advantage of the special properties associated with a real symmetric or Hermitian matrix.

Consider the unitary similarity transform that converts the m by m symmetric or Hermitian matrix A into tridiagonal form:


        A = Q T Q*
      
Let Qn indicate the m by n matrix formed from the first n columns of Q; similarly, Tn is the n by n submatrix of T. Then we write

        Tn = Qn* * A * Qn
      

As we pass from n to n+1, we simply add a new column q(n+1) to Qn, and to Tn we add a new diagonal value alpha(n+1) and an offdiagonal value beta(n). Thus, the iteration allows us to gradually build up the matrices Qn and Tn. For reasons not completely understood, even for n much smaller than m, the eigenvalues of Tn will tend to some of the eigenvalues of A, and typically the extreme ones.

The new vector q(n+1) is a Krylov vector; actually, because of orthogonalization, it represents the new component introduced into the Krylov subspace by the product A*q(n).

Thus, a typical procedure is to compute Tn and apply a standard eigenvalue procedure for tridiagonal matrices to extract its eigenvalues.

The entries of the matrix Tn are denoted by

        alpha(1)  beta(1)
        beta(1)   alpha(2)  beta(2)
                  beta(2)   alpha(3)  ...
                            ...       ...       beta(n-1)
                                      beta(n-1) alpha(n)

An efficient form for the Lanczos iteration is:

        beta0 = 0;
        q(0) = 0;
        Let b be an arbitrary initial vector;
        Set q(1) = b / ||b||;
        for n = 1, ...
          v = A * q(n)
          alpha(n) = q(n)' * v
          v = v - beta(n-1) * q(n-1) - alpha(n) * q(n)
          beta(n) = ||v||
          q(n+1) = v / beta(n)
        end for
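
A minimal numpy sketch of the iteration above follows; it assumes a real symmetric A and no breakdown (all beta values positive), and the variable names simply mirror the pseudocode. The eigenvalues of the tridiagonal matrix built from alpha and beta (for example via scipy.linalg.eigh_tridiagonal) then approximate some eigenvalues of A, typically the extreme ones.

        import numpy as np

        def lanczos(A, b, nsteps):
            # Returns the diagonal (alpha) and offdiagonal (beta) of T_n.
            m = len(b)
            q_prev = np.zeros(m)
            q = b / np.linalg.norm(b)
            alpha, beta = [], [0.0]               # beta[0] plays the role of beta(0) = 0
            for n in range(nsteps):
                v = A @ q
                alpha.append(q @ v)
                v = v - beta[-1] * q_prev - alpha[-1] * q
                beta.append(np.linalg.norm(v))    # assumed nonzero (no breakdown)
                q_prev, q = q, v / beta[-1]
            # beta[1:nsteps] holds the n-1 offdiagonal entries beta(1), ..., beta(n-1).
            return np.array(alpha), np.array(beta[1:nsteps])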
      

LAPACK

LAPACK is a set of linear algebra routines, intended as the replacement for LINPACK and EISPACK. It is a project sponsored by the Argonne National Laboratories and the Numerical Algorithms Group (NAG). The LAPACK routines are intended to achieve optimal performance on various machines by calling the level 3 BLAS, which use block methods to achieve high performance.

LDL Factorization

The LDL factorization of a symmetric matrix is a decomposition of the form:

A = L * D * L',
involving a unit lower triangular matrix L, and a diagonal matrix D.

The LDL factorization is a special case of the LU Factorization, in which we give up the option of Pivoting in order to get a very simple factorization. If the matrix A has a zero pivot, the factorization is still valid; we just can't guarantee that we can solve linear systems.

If the matrix A is actually positive definite, then we can get the even stronger Cholesky factorization.

Linear Dependence

A set of M vectors, each of order N, is called linearly dependent if there is some linear combination of the vectors, with at least one nonzero coefficient, which equals the zero vector.

If the I-th vector is used as row I of an M by N matrix A, then this is equivalent to saying that there is a nonzero vector C such that

C' * A = 0.
If no such combination is possible, then the vectors are linearly independent.

Simple facts:

Linear Least Squares

A linear least squares problem seeks an "approximate solution" x to a linear system A * x = b, where x is to be chosen in such a way as to minimize the L2 or Euclidean norm of the residual error b - A * x.

A least squares approach is usually tried when the standard algorithms for solving the linear system, such as Gauss Elimination, are not suitable, perhaps because

In the most common case, there are more equations than unknowns, so that A is actually a matrix with rectangular order of M rows by N columns. Then the system may or may not be consistent, and A itself may perhaps not have the fullest possible rank.

Despite the fact that this problem can't be solved by the usual means, it is still the case that:

  1. a unique exact solution x may exist;
  2. many exact solutions might exist;
  3. there might be no exact solutions, but clearly some vectors x are "better" than others, in the sense that they produce a "smaller" residual error b - A * x.

By choosing to use the L2 vector norm, we are using a least-squares approach. Two advantages are that in case 2, there will always exist a unique solution of minimal norm, and in case 3, there are good algorithms for finding the unique approximate solution which has a residual of minimum norm.

If the columns of the coefficient matrix are linearly independent, then a simple approach to solving the problem is to form and solve the normal equations, although this incurs a significant penalty in terms of the condition number.

Other, superior, methods of solving this problem are preferred, usually via the QR factorization or the pseudoinverse. Such methods can also handle the case where the matrix does not have maximal rank, or in which the number of columns is greater than the number of rows.
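
As a minimal sketch of the QR approach, assuming numpy is available and that A has full column rank; numpy's lstsq routine, which uses the singular value decomposition, is shown only as a cross-check:

        import numpy as np

        A = np.array([[1.0, 1.0],
                      [1.0, 2.0],
                      [1.0, 3.0],
                      [1.0, 4.0]])
        b = np.array([6.0, 5.0, 7.0, 10.0])

        Q, R = np.linalg.qr(A)                 # "economy" factors: Q is 4 by 2, R is 2 by 2
        x = np.linalg.solve(R, Q.T @ b)        # minimizes || b - A*x || in the L2 norm

        x_check, *_ = np.linalg.lstsq(A, b, rcond=None)
        assert np.allclose(x, x_check)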

Linear Space

A linear space is a collection X of "vectors", a scalar field F, (usually the real or complex field), and the operations of vector addition and scalar multiplication, with the properties that:

Examples of linear spaces include:

A linear space has a dimension. The dimension of a linear space can be thought of as the smallest number of vectors from which the space can be reconstructed, or the cardinality of the smallest set of vectors that spans the space, or the cardinality of any basis for the space.

Linear Transformation

A Linear Transformation is, formally, an operator A applied to elements x of some linear space X, with the properties that

If X is a finite dimensional vector space with an orthonormal basis, then any element x can be represented by an N dimensional vector, and any linear transformation on X can be represented by a matrix. It is usually this matrix that people think of when they speak of a linear transformation.

Linear transformations are frequently used, for example, in computer graphics. It is interesting to note that for an arbitrary linear transformation matrix A, the Gram Schmidt factorization allows us to write A as the product of an orthogonal matrix Q and an upper triangular matrix R. If we "factor out" the diagonal entries of R, we then can view A as:

A = Q * D * S
where Q is an orthogonal matrix, D is the diagonal matrix containing the diagonal entries of R, and S is a unit upper triangular matrix.

LINPACK

LINPACK is a standard linear algebra package for factoring matrices, computing matrix determinants, condition numbers, and inverses, and for solving linear systems. Additional capabilities include least squares solutions, QR and singular value decomposition.

Many matrix storage modes are allowed, including dense, banded, symmetric, positive definite, symmetric banded, but LINPACK does not handle sparse matrices, nor does it employ iterative methods.

Here are the routines available for single precision computations. There is a related set available for complex matrices, with names which begin with C instead of S.

Lonesum Matrix

A lonesum matrix is a (possibly rectangular) zero one matrix with the property that every entry can be determined using only the list of row and column sums.

An example of a square lonesum matrix is:

            0 1 1
            0 0 1
            1 1 1
          
because it is possible to write it, with its row and column sums, as:
            0 1 1 | 2
            0 0 1 | 1
            1 1 1 | 3
            ------+
            1 2 3
          
and then to hide the entries:
            * * * | 2
            * * * | 1
            * * * | 3
            ------+
            1 2 3
          
and to logically figure out what every entry must have been, using only the row and column sums and the fact that the entries are either 0 or 1.
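
A minimal sketch of this reconstruction, assuming numpy is available and assuming the sums really did come from a lonesum matrix (in that case the rows are nested, so each row places its ones in the columns with the largest column sums); the function name is illustrative:

        import numpy as np

        def reconstruct_lonesum(row_sums, col_sums):
            m, n = len(row_sums), len(col_sums)
            order = np.argsort(col_sums)[::-1]     # column indices, largest sum first
            A = np.zeros((m, n), dtype=int)
            for i, r in enumerate(row_sums):
                A[i, order[:r]] = 1                # r ones go in the "largest" columns
            return A

        print(reconstruct_lonesum([2, 1, 3], [1, 2, 3]))
        # [[0 1 1]
        #  [0 0 1]
        #  [1 1 1]]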

LU Factorization

The LU factorization of a matrix is a decomposition of the form:

A = P * L * U,
involving a permutation matrix P, a unit lower triangular matrix L, and an upper triangular matrix U.

P records the pivoting operations carried out in Gauss elimination, L the row multipliers used during elimination and U contains the pivot values and other information.

The factors are typically computed via Gauss elimination. Once the factors are computed, they may be used to

The LU factorization is generally computed only for a nonsingular square matrix, but the LU factorization exists and is useful even if the matrix is singular, or rectangular.

While the LU factorization is usually quite satisfactory, it is possible to prefer a factorization of the form

A = P * L * U * Q,
by using Gaussian elimination with complete pivoting, in which case Q is a permutation matrix selecting the variable (or column) to be eliminated at each step. Another rival factorization is the QR factorization
A = Q *R,
which is slightly more expensive, but has better stability and accuracy properties.

LAPACK and LINPACK include routines to compute the LU factorization of a matrix stored in a variety of formats. Note that the P, L and U factors themselves are scrambled and compressed in the data storage used by these routines, so that it's difficult to determine their actual values.
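
As a small illustration, assuming scipy is available, scipy.linalg.lu returns the three factors explicitly rather than in packed form:

        import numpy as np
        from scipy.linalg import lu

        A = np.array([[2.0, 1.0, 1.0],
                      [4.0, 3.0, 3.0],
                      [8.0, 7.0, 9.0]])

        P, L, U = lu(A)      # P permutation, L unit lower triangular, U upper triangular
        assert np.allclose(P @ L @ U, A)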

M Matrix

An M matrix is a (real) (square) invertible matrix whose offdiagonal elements are nonpositive, and whose inverse is a nonnegative matrix.

An M matrix is also called a Minkowski matrix.

There are many definitions of an M matrix. Another one is that a matrix A is an M matrix if there exists a nonnegative matrix B, with a maximal eigenvalue r, such that

A = c * I - B
where c >= r. From this definition, it should be clear that an M matrix must have a nonnegative diagonal, and nonpositive offdiagonal.

Facts about an M matrix A:

A symmetric M matrix is called a Stieltjes matrix. A Stieltjes matrix is positive definite.

Magic Matrix

A magic matrix is a square matrix with an associated constant mu such that all row sums and column sums and the diagonal and anti-diagonal sums are equal to mu.

If we only require that the row sums and column sums equal the same constant, then the matrix is known as a semi-magic matrix. The product of two semi-magic matrices is another semi-magic matrix, while the corresponding statement does not hold for magic matrices themselves.

Most magic matrices of order n are made from the consecutive integers from 1 to n^2. Note that there can be no 2x2 magic matrix with entries 1, 2, 3 and 4. It is common that an entry k and its "complement" n^2+1-k are located symmetrically with respect to the center of the matrix. Such a magic matrix is called regular.

Example:

         1 14 15  4
        12  7  6  9
         8 11 10  5
        13  2  3 16
      

A regular magic matrix exhibits some interesting properties. For instance, if we let J be the Exchange matrix, then

A + J * A * J = 2 * ( mu / n ) * E
where E is the matrix all of whose entries are 1.

Because the entries of A are positive, the Perron Frobenius theorem guarantees that there is a positive eigenvalue, equal to the spectral radius of the matrix, and which is a simple eigenvalue (having algebraic multiplicity of 1), and an associated eigenvector all of whose entries are positive.

In fact, the dominant simple eigenvalue is mu, and the associated positive eigenvector is e=(1,1,...,1)'.

Matrix Exponential

The matrix exponential of a square matrix A is a matrix B(A,t) = exp ( A * t ), which has properties similar to those of the exponential function of a scalar argument.

In particular:

If the matrix A is orthogonally diagonalizable, so that

A = Q * Lambda * Q',
then, using the power series definition of B(A,t) results in:
B(A,t) = Q * sum ( 0 <= I < Infinity ) ( Lambda * t )^I / I! * Q';
or
B(A,t) = Q * exp ( Lambda, t ) * Q',
where exp ( Lambda, t ) is the diagonal matrix whose I-th diagonal entry is exp(lambda(i)*t), the exponential of t times the I-th eigenvalue of A.
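
A minimal numpy sketch of this eigenvector formula, restricted to a real symmetric A so that an orthogonal diagonalization exists; the function name is illustrative:

        import numpy as np

        def expm_symmetric(A, t):
            # exp(A*t) via A = Q * Lambda * Q' for real symmetric A.
            lam, Q = np.linalg.eigh(A)     # eigenvalues and orthonormal eigenvectors
            return Q @ np.diag(np.exp(lam * t)) @ Q.T

        A = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
        B = expm_symmetric(A, 1.0)
        # B equals [[cosh(1), sinh(1)], [sinh(1), cosh(1)]] up to roundoff.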

In the general case where A cannot be orthogonally diagonalized, or where the eigenvalues and eigenvectors cannot be reliably computed, the computation of the matrix exponential is difficult.

Matrix Factorization

Matrix factorization is the process of rewriting a matrix A as a product of factors with certain special properties. Matrix factorization is a key technique in solving linear systems, determining eigenvalues, and many other tasks.

Useful matrix factorizations include:

Matrix Multiplication

Matrix multiplication is the computation of the matrix-vector product A * x or the matrix-matrix product A * B, where A and B are (possibly rectangular) matrices, and x is a vector.

A matrix product is only meaningful when the factors are conformable. This is a condition on the dimensions of the factors. If we are computing A * x or A * B, then the column order of A must equal the order of x or the row order of B.

To multiply the L by M array A times the M by N array B, the product C will have L rows and N columns, and a typical entry Ci,j is:

Ci,j = sum ( K = 1 to M ) Ai,k * Bk,j.
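
A minimal Python sketch of this triple-sum definition follows (in practice a library routine, such as numpy's @ operator, should be used instead):

        import numpy as np

        def matmul(A, B):
            # C[i,j] = sum over k of A[i,k] * B[k,j]
            L, M = A.shape
            M2, N = B.shape
            assert M == M2, "the factors must be conformable"
            C = np.zeros((L, N))
            for i in range(L):
                for j in range(N):
                    for k in range(M):
                        C[i, j] += A[i, k] * B[k, j]
            return C

        A = np.random.rand(3, 4)
        B = np.random.rand(4, 2)
        assert np.allclose(matmul(A, B), A @ B)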

Matrix multiplication is not commutative. A*B and B*A are not even of the same order unless both matrices are square. Even if they are square, the two products will in general have different numerical values.

Matrix multiplication is associative: (A*B)*C=A*(B*C). While the result is not affected by the order of operations, the amount of work can change, if the matrices are of different orders. For products of a large number of matrices, techniques of dynamic programming are required to determine the most efficient order in which to carry out the operations.

As a computational task, matrix multiplication is relatively expensive. To multiply two matrices of order N takes roughly 2*N^3 floating point operations, which is more expensive than Gauss elimination of a matrix of order N. On vector and parallel machines, it is important to write a multiplication algorithm carefully, to take advantage of the potential for speedup.

Matrix Norm

A matrix norm is a scalar quantity which may be thought of as a sort of "magnitude" of the matrix. The norm can be used to estimate the effect of multiplying the matrix times a vector, solving a linear system, or carrying out other matrix operations. The norm is also used in the analysis of error and convergence.

A matrix norm ||*|| must satisfy the following properties:

Matrix norms are most often needed when dealing with combinations of matrices and vectors. In such a case, it is important that the matrix norm and vector norm that are being used are compatible.

Any given vector norm can be used to derive a corresponding matrix norm, guaranteed to be compatible. This matrix norm is known as the vector-bound matrix norm.

Only if the matrix norm and vector norm are compatible can we write a useful bound like:

||A*x|| <= ||A|| * ||x||

Matrix norms include:

The spectral radius rho(A) is not a matrix norm, although it is often thought of that way.

Matrix Order

The order of a square matrix is the number of rows and columns. Thus a matrix of order 5 is a square matrix with 5 rows and 5 columns.

The order of a rectangular matrix is described by giving both the number of rows and the number of columns. Thus a rectangular matrix might have order 5 by 4.

The order of a matrix refers to a property of the mathematical object, which does not depend on how the information is stored in a computer. The actual numbers representing the matrix may be stored in a rectangular two dimensional array, whose row and column lengths are equal to or greater than the mathematical orders, a rectangular array of lesser size (for band matrix storage, say), or even in a collection of several separate one dimensional arrays.

Matrix Properties

Matrix properties are any features of a matrix which may be of use in choosing a storage scheme or algorithm, or analyzing convergence or error properties, but which are not immediately evident from the simple arrangement of zeroes in the matrix, or any symmetries among the nonzero elements.

Matrix properties include:

Matrix Rank

The rank of a matrix is a measure of the linear independence of its rows and columns.

The row rank of a matrix of order M by N is the number of linearly independent rows in the matrix, while the column rank is the number of linearly independent columns. The row rank will be between 0 and M, the column rank between 0 and N. If the row rank is equal to M, the matrix is said to have maximal or full row rank; there are corresponding terms for column rank.

For a "square" matrix, of order N, the row and column ranks will be equal. A square matrix is nonsingular if and only if it has maximal row rank; in other words, if no row is a linear combination of the other rows, and similarly for columns. A square matrix with full rank has an inverse, a nonzero determinant, and Gauss elimination with pivoting can be used to solve linear systems involving the matrix.

Every singular matrix is "almost" nonsingular. That is, using any matrix norm you like, and no matter how small you specify the (positive) tolerance epsilon, there is a nonsingular matrix closer than epsilon to the singular matrix. Another way to look at this is to realize that it is always possible, by making tiny changes to the entries of a singular matrix, to turn it into a nonsingular matrix. (Consider the zero matrix; add epsilons to the diagonal entries, and it's nonsingular.) Thus, given the roundoff implicit in computation, the determination of matrix rank is not a reliable process.

If the rank of a matrix is actually desired, a reasonable method is to compute the QR factorization. Very small diagonal terms in the R factor may indicate linear dependence of the corresponding columns of the matrix. The singular value decomposition will also give this sort of information.

Matrix Splitting

A matrix splitting is a decomposition of the system matrix:

A = M - N
in order to analyze the behavior of an iterative method for solving the linear system A*x = b. The M matrix is the multiplier of the next iterate, and the N matrix is the multiplier of the current iterate.

For instance, consider the Jacobi iteration. If we decompose the matrix into its strictly lower triangular, diagonal, and strictly upper triangular parts, we can write:

A = L + D + U
The Jacobi iteration can be written as:
D * xnew = b - ( L + U ) * x
Hence, in the language of matrix splittings,
M = D
N = - ( L + U )

The matrix splitting gives us a convenient way of expressing the iteration matrix, which is the multiplier of the current iterate in the (explicit) formula for the next one. In terms of the matrix splitting, this iteration matrix always has the form M^-1 * N. For instance, for the Jacobi iteration, the explicit formula for xnew is:

xnew = - D^-1 * ( L + U ) * x + D^-1 * b
and so the iteration matrix is - D^-1 * ( L + U ).
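
As a minimal numpy sketch, the Jacobi iteration matrix can be formed directly from the splitting and its spectral radius inspected (the iteration converges when the spectral radius is below 1); the matrix here is only an example:

        import numpy as np

        A = np.array([[4.0, 1.0, 0.0],
                      [1.0, 4.0, 1.0],
                      [0.0, 1.0, 4.0]])

        M = np.diag(np.diag(A))                   # M = D, the diagonal part
        N = M - A                                 # N = -(L + U), so that A = M - N
        iteration_matrix = np.linalg.solve(M, N)  # M^-1 * N = -D^-1 * (L + U)
        rho = max(abs(np.linalg.eigvals(iteration_matrix)))
        print(rho)                                # about 0.35 here, so Jacobi converges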

Matrix Storage

Matrix storage formats are the schemes for storing the values of a mathematical matrix into memory locations in a computer, or, in some cases, for writing this information to an external file.

A special storage format may be chosen because of the overall matrix structure of zero entries, or because of the matrix symmetry involving nonzero entries.

Common matrix storage formats include:

In addition, many linear programming problems are stored in files that are formatted according to the MPS Storage Format.

Matrix Structure

Matrix structure is the classification of matrices according to the pattern of their zero entries.

If a matrix has a known structure, it may be possible to use a specialized storage scheme that takes less space, or an algorithm that can execute more quickly.

A matrix about which nothing is known, or which exhibits no special pattern, may be called full or dense or general.

Matrix structure patterns that can occur include:

Matrix Symmetry

Matrix symmetry classifies certain common patterns that may relate the nonzero values of the matrix.

This classification occurs after the study of the basic matrix structure induced by the pattern of zero elements.

As with matrix structure, any matrix symmetries that are present may influence the choice of storage used for the matrix data, and the algorithms suitable to be applied to the matrix.

Matrix symmetries include:

Minimal Polynomial

The minimal polynomial of a matrix A is the monic polynomial P(X) of least degree with the property that P(A)=0.

The Cayley-Hamilton theorem asserts that every matrix satisfies its own characteristic equation. In other words, the polynomial

P ( lambda ) = det ( A - lambda * I ),
which is zero when lambda is equal to any of the numbers which are eigenvalues of the matrix A, is equal to the zero MATRIX when lambda is replaced in the explicit formula for P(lambda) by A.

Thus, every matrix A of order N is guaranteed to be a root of a polynomial P(X) of degree N. Therefore the minimal polynomial of A is either the characteristic polynomial, or else some polynomial Q(X) of degree less than N. As a simple example, the characteristic polynomial of the identity matrix I of order N is (X-1)^N, but the minimal polynomial is X-1.

A matrix is diagonalizable if and only if its minimal polynomial consists of distinct linear factors.

The minimal polynomial of a matrix is used to define a derogatory matrix, which is important when considering the eigenvalues and eigenvectors of a matrix.

Minor Matrix

A minor matrix is derived from a matrix A by removing some rows and columns. Usually both A and the minor matrix are square, and usually only one row and one column are removed from A at a time.

For instance, if A is

        1 2 3
        4 5 6
        7 8 9
      
then the minor matrix derived by deleting row 2 and column 1 is
        2 3
        8 9
      

The principal minor matrices or principal minors of a square matrix of order N are a set of N matrices of orders 1, 2, ..., N; the M-th matrix has upper left entry A1,1 and lower right entry AM,M.

Simple facts involving minor matrices:

Monic Polynomial

A monic polynomial is a polynomial whose leading coefficient is 1.

The leading coefficient, of course, is the coefficient of the highest power of X, (or whatever the independent variable happens to be). Thus, the polynomial X^2+3*X+17 is monic, but X^2+3*X^3 is not.

The main reason for the idea of a monic polynomial is to be able to specify a unique polynomial with a given property. Thus, for instance, there are many polynomials of degree 3 which are zero at 1, 2, and 3, but they are all multiples of each other. In order to make a specific choice, we may specify that we mean the monic polynomial with these properties.

The MPS Storage Format

The MPS data format is an industry standard for the description of a variety of linear programming problems.

Reference:

  1. Bruce Murtagh,
    Advanced Linear Programming: Computation and Practice,
    McGraw-Hill, 1981.
  2. J L Nazareth,
    Computer Solution of Linear Programs,
    Monographs on Numerical Analysis,
    Oxford University Press, 1996.

An MPS file is organized into named sections, which appear in the following order:

  1. NAME
  2. ROWS
  3. COLUMNS
  4. RHS
  5. RANGES (optional)
  6. BOUNDS (optional)
  7. ENDATA
Each section begins with a single line containing the name of the section in the Indicator field, and may be followed by a number of data lines, which have a very particular format.

An MPS file may also have comment lines. Such lines may occur anywhere in the file. They begin with an asterisk '*' in column 1. They may contain information intended to explain the problem, or other facts of interest to a human, and of no interest to a program parsing the file.

The data lines of an MPS file are divided into fields of a specific width:
              Field 1    Field 2  Field 3  Field 4  Field 5  Field 6
    Columns   2-3        5-12     15-22    25-36    40-47    50-61
    Contents  Indicator  Name 1   Name 2   Value 1  Name 3   Value 2
Indicator and Name fields contain names of things; Value fields contain numeric values. The particular use of these fields depends on the section in which the data line occurs.

We now discuss in detail the format of the data lines that occur in each data section.

  1. NAME: This section consists of just one card, with the word NAME in columns 1-4, and the title of the problem in columns 15-22.
  2. ROWS: In this section all the row labels are defined, as well as the row type. The row type is entered in field 1 (in column 2 or 3) and the row label is entered in field 2 (columns 5-12).

    The code for specifying row type is as follows:
    Type        Symbol   Meaning
    =           E        Equal to.
    <=          L        Less than or equal to.
    >=          G        Greater than or equal to.
    Objective   N        Objective function.
    Free        N        No restriction imposed.

    This section of data begins with a card with ROWS in columns 1-4, followed by a data card for each row. The first N-type row encountered is regarded as the objective function.

    Linear combinations of rows may also be specified. In this case, the above row types are denoted respectively by the codes DE, DL, DG, and DN, in columns 2-3. Field 2 contains the linear combination rowname. Fields 3-6 contain the rowname(s) in fields 3 and 5, and their multiplier(s) in fields 4 and 6. A linear combination of three or more rows requires additional cards, following the first card contiguously. In the additional cards, field 1 is empty. The right-hand sides of a linear combination row must be specified in the RHS section, described below.

  3. COLUMNS: This section defines the names of the variables, the coefficients of the objective, and all the nonzero matrix elements Ai,j. The data is entered column by column, and all the data cards for the nonzero entries in each column must be grouped contiguously. The section begins with a card with COLUMNS in columns 1-7, followed by data cards which may have one or two matrix elements per card.

    The data card has the column label in field 2 (columns 5-12), the row label in field 3 (columns 15-22), and the value of the coefficient Ai,j or Cj in field 4 (columns 25-36, including a decimal point). If more than one nonzero row entry for the same column is to be made on the card, then field 5 (columns 40-47) has the next row label and field 6 (columns 50-61) has its corresponding coefficient value. The use of fields 5 and 6 is optional.

    There is no need to specify columns for slack variables; this is taken care of automatically, based on the definitions of the row types.

  4. RHS: This section contains the elements of the right-hand side. The section begins with a card with RHS in columns 1-3. Since the right-hand side can be regarded as another column of the matrix, the data cards specifying the nonzero entries are in exactly the same format as the COLUMNS data cards, except that field 2 (columns 5-12) has a label for the right-hand side. More than one right-hand side may thus be specified in this section.
  5. RANGES: (optional). This section is for constraints of the form:
    hi <= Ai,1*x1 + Ai,2*x2 + ... + Ai,n*xn <= ui
    that is, both an upper and lower bound are specified for the row. The range of the constraint is
    ri = ui - hi
    The value of ui or hi is specified in the RHS section data, and the value of ri is specified in the RANGES section data. This information, plus the row type specified in the ROWS section, defines the bounds ui and hi.

    If bi is the number entered in the RHS section and ri is the number specified in the RANGES section, then ui and hi are defined as follows:

    Row symbol   Meaning   Sign of ri   Lower limit hi   Upper limit ui
    G            >=        + or -       bi               bi+|ri|
    L            <=        + or -       bi-|ri|          bi
    E            =         +            bi               bi+|ri|
    E            =         -            bi-|ri|          bi

    The section begins with a card with RANGES in columns 1-6. The data cards specifying the values of ri are in exactly the same format as the COLUMNS data cards, except that field 2 (columns 5-12) has a label for the column of ranges (which can also be regarded as another column of the matrix). More than one column of ranges may be specified, but all the data cards for each column must be grouped contiguously.

  6. BOUNDS: (optional). In this section, bounds on the variables are specified. The section begins with a card with BOUNDS in columns 1-6. The bounds are entered as a row, with a corresponding row label. The nonzero entries in this row vector correspond to columns in the matrix and must be in the same order in which the column names appear in the COLUMNS section. When bounds are not specified for a column, or if the entire BOUNDS section is omitted, the usual bounds are assumed,
    0 <= xj < infinity

    More than one bound for a particular variable may be entered, that is, both a lower and an upper bound; when only one bound is specified the other is assumed to be one of the default values of 0 or infinity as shown in parentheses below.

    • Field 1 (columns 2-3) specifies the type of bound:
      Symbol Meaning Form of the bound
      LO lower bound bj <= xj ( < infinity )
      UP upper bound ( 0 <= ) xj <= bj
      FX fixed variable xj = bj
      FR free variable the value of xj is not restricted.
      MI lower bound is negative infinity -infinity <= xj ( <= 0 )
      PL upper bound is +infinity (default) ( 0 <= ) xj < infinity
    • Field 2 (columns 5-12) specifies the bounds row label. More than one bounds row may be entered, but the data must be grouped together contiguously for each bounds row.
    • Field 3 (columns 15-22) specifies the column label (j) corresponding to the variable xj;
    • Field 4 (columns 25-36) specifies the bound value bj;
    • Note that the handling of the MI type is apparently not standardized. In some implementations, the upper bound is assumed to be +infinity, rather than 0 as in the table above. In such cases, the upper bound may be reset through the use of an UP bound elsewhere in the MPS file.
  7. ENDATA: This section is just one card, with ENDATA in columns 1-6, signalling the end of input.

Example: the file afiro.mps.


    NAME          AFIRO                                                  
    ROWS
     E  R09     
     E  R10     
     L  X05     
     L  X21     
     E  R12     
     E  R13     
     L  X17     
     L  X18     
     L  X19     
     L  X20     
     E  R19     
     E  R20     
     L  X27     
     L  X44     
     E  R22     
     E  R23     
     L  X40     
     L  X41     
     L  X42     
     L  X43     
     L  X45     
     L  X46     
     L  X47     
     L  X48     
     L  X49     
     L  X50     
     L  X51     
     N  COST    
    COLUMNS
        X01       X48               .301   R09                -1.   
        X01       R10              -1.06   X05                 1.   
        X02       X21                -1.   R09                 1.   
        X02       COST               -.4   
        X03       X46                -1.   R09                 1.   
        X04       X50                 1.   R10                 1.   
        X06       X49               .301   R12                -1.   
        X06       R13              -1.06   X17                 1.   
        X07       X49               .313   R12                -1.   
        X07       R13              -1.06   X18                 1.   
        X08       X49               .313   R12                -1.   
        X08       R13               -.96   X19                 1.   
        X09       X49               .326   R12                -1.   
        X09       R13               -.86   X20                 1.   
        X10       X45              2.364   X17                -1.   
        X11       X45              2.386   X18                -1.   
        X12       X45              2.408   X19                -1.   
        X13       X45              2.429   X20                -1.   
        X14       X21                1.4   R12                 1.   
        X14       COST              -.32   
        X15       X47                -1.   R12                 1.   
        X16       X51                 1.   R13                 1.   
        X22       X46               .109   R19                -1.   
        X22       R20               -.43   X27                 1.   
        X23       X44                -1.   R19                 1.   
        X23       COST               -.6   
        X24       X48                -1.   R19                 1.   
        X25       X45                -1.   R19                 1.   
        X26       X50                 1.   R20                 1.   
        X28       X47               .109   R22               -.43   
        X28       R23                 1.   X40                 1.   
        X29       X47               .108   R22               -.43   
        X29       R23                 1.   X41                 1.   
        X30       X47               .108   R22               -.39   
        X30       R23                 1.   X42                 1.   
        X31       X47               .107   R22               -.37   
        X31       R23                 1.   X43                 1.   
        X32       X45              2.191   X40                -1.   
        X33       X45              2.219   X41                -1.   
        X34       X45              2.249   X42                -1.   
        X35       X45              2.279   X43                -1.   
        X36       X44                1.4   R23                -1.   
        X36       COST              -.48   
        X37       X49                -1.   R23                 1.   
        X38       X51                 1.   R22                 1.   
        X39       R23                 1.   COST               10.   
    RHS
        B         X50               310.   X51               300.   
        B         X05                80.   X17                80.   
        B         X27               500.   R23                44.   
        B         X40               500.   
    ENDATA

Multiplicity

The algebraic multiplicity of a root lambda of a polynomial equation p(x)=0 is the "number of times lambda is a root".

More precisely, the algebraic multiplicity is the exponent of the factor (x-lambda) in the factorization of p(x). A root is known as a simple root if it has algebraic multiplicity 1; otherwise it is a repeated root.

Eigenvalues are defined as the roots of the characteristic equation of a matrix, and so an eigenvalue has an algebraic multiplicity. The behavior of the eigenvalue problem depends in part on the multiplicity of the eigenvalues. In particular:

For example, for the matrix

      7 1
      0 7
      
7 is an eigenvalue of algebraic multiplicity 2, but there is only one linearly independent eigenvector, x = ( 1, 0 )', so 7 has a geometric multiplicity of 1.

The Neumann Series

The Neumann series for a square matrix A is the infinite sum


        I + A + A^2 + A^3 + A^4 + ...
      

If the spectral radius of the matrix A satisfies rho(A) < 1, then I-A is nonsingular, and the Neumann series converges to (I-A)^-1.

Conversely, if the Neumann series converges, then the spectral radius of A must be less than 1.
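
A minimal numpy sketch of the partial sums converging to (I-A)^-1 when rho(A) < 1:

        import numpy as np

        A = np.array([[0.2, 0.1],
                      [0.3, 0.4]])     # spectral radius 0.5, safely below 1

        S = np.eye(2)                  # running partial sum I + A + A^2 + ...
        P = np.eye(2)                  # current power of A
        for _ in range(50):
            P = P @ A
            S = S + P

        assert np.allclose(S, np.linalg.inv(np.eye(2) - A))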

Nilpotent Matrix

A nilpotent matrix is one for which the square, cube, or some finite power equals zero. For instance, any strictly lower triangular matrix is nilpotent.

Consider the following matrix A and its powers:

               0 0 0          0 0 0          0 0 0
        A    = 2 0 0   A**2 = 0 0 0   A**3 = 0 0 0
               3 4 0          8 0 0          0 0 0
      

Simple facts about a nilpotent matrix A:

Nonnegative Matrix

A nonnegative matrix A has only nonnegative entries; that is, for all indices I and J,

Ai,j >= 0.

Similar terms, with obvious definitions, include matrices that are positive, negative and nonpositive. It's easy to check if a matrix is nonnegative; this is much simpler than checking whether a matrix is positive definite. The expression A >= 0 is sometimes used to express the notion that the matrix A is nonnegative.

Facts about a nonnegative matrix A:

Normal Equations

The normal equations for an M by N rectangular linear system

A * x = b
are computed by multiplying both sides by the transpose matrix:
A' * A * x = A' * b.

In the case where N < M, and the N columns of A are linearly independent, the matrix

B = A' * A
will be an invertible symmetric N by N matrix, and the normal equations can be solved by Gauss elimination. The matrix B is sometimes called the Gram matrix.

However, this method of producing an approximate solution of an overdetermined (and usually inconsistent) linear system is usually not recommended. The matrix B has a condition number that is the square of the condition number of A. Hence we may expect our solution to have a considerable error component. We can do better. There are more accurate methods which employ the QR factorization or the singular value decomposition.

Normal Matrix

A normal matrix A is a real matrix that commutes with its transpose:

A * A' = A' * A;
or a complex matrix that commutes with its transpose conjugate:
A * A^H = A^H * A.

Why define such an odd property? Because it's easy to check, and it turns out that a matrix is normal if and only if it is unitarily similar to a diagonal matrix. That's the same as saying the matrix is not only diagonalizable (similar to a diagonal matrix), but has a complete set of orthonormal eigenvectors.

A matrix is automatically normal if it is:

Note that an upper or lower triangular matrix is not normal, (unless it is actually diagonal!); such a matrix may have a complete set of eigenvectors (for example, if the eigenvalues are distinct), but the eigenvectors cannot be orthonormal.

Null Space

The null space of a matrix A is the set of all null vectors x such that

A * x = 0.

Simple facts:

Null Vector

A null vector of a matrix A is a non-zero vector x with the property that A * x = 0.

Facts about a null vector:

Ones Matrix

The ones matrix is a matrix all of whose entries are one.

A square ones matrix of order 2 or greater is singular.

A ones matrix, which can be easily defined in Matlab by

A = ones ( m, n )
is often a convenient tool for various algorithms or matrix definitions.

Order of a Matrix

When we speak of the order of a matrix, it is usually understood that the matrix is square, with N rows and N columns, and the order is simply the number N.

If the matrix is rectangular, having M rows and N columns, it is less common to speak of its order, but in this case, the order would be the pair of values, that is, (M,N).

Orthogonal Matrix

An orthogonal matrix A is a square, invertible matrix for which it is true that:

A' = A^-1
which implies:
A' * A = A * A' = I.

Facts about an orthogonal matrix A:

The QR factorization of a rectangular M by N matrix A, with M>N has two forms:

For both factorizations, it is common to refer to the matrix Q as "orthogonal", but it is important to realize that, strictly speaking, only a square matrix can be an orthogonal matrix. In the first factorization, the rectangular matrix Q has columns that are of unit length and pairwise orthogonal, and it is true that Q'*Q=I, but the rows are generally not of unit length, nor pairwise orthogonal, nor is it true that Q*Q'=I.

In complex arithmetic, the corresponding concept is a unitary matrix.

Orthogonal Similarity Transformation

An orthogonal similarity transformation is a similarity relationship between two real matrices A and B, carried out by an orthogonal matrix U, of the form:

A = U^-1 * B * U = U' * B * U.

A and B are said to be orthogonally similar.

Orthogonal transformations are very common, particularly in eigenvalue computations. A general matrix A may be reduced to upper Hessenberg form B by such transformations, so that B = Q * A * Q'. The nice thing about this is that if we later wish to reverse the transformation, the inverse of Q is immediately available, because it is just the transpose of Q! This means that orthogonal transformations are very easy to apply and to invert.

Another nice feature of orthogonal transformations is that they may be built up gradually as the product of a series of Householder matrices or Givens rotation matrices.

Orthogonal Vectors

Two vectors u and v are orthogonal if their dot product is 0:

u' * v = sum ( 1 <= i <= m ) ui * vi = 0,
Of course, we are assuming here that both vectors have the same spatial dimension.

A set of n vectors V is pairwise orthogonal if it is true that every pair of vectors in the set is orthogonal. If an m by n matrix A is formed using pairwise orthogonal vectors for its columns, then the product matrix A'*A is diagonal, with the diagonal entries being the squares of the L2 vector norms of each vector.

As an example, this matrix A has pairwise orthogonal columns:

        5  -1   1
       -4  -1   2
        1   1   3

Note that, from the definition, the zero vector is orthogonal to every vector, even to itself! Hence, in some cases, it is necessary to phrase a statement as "Let u and v be two nonzero orthogonal vectors..."

Orthonormal Vectors

A set of orthonormal vectors is a collection of M vectors X(I), each of order N, each of length 1 in the Euclidean norm:

X(I)' * X(I) = ( X(I), X(I) ) = 1,
and pairwise orthogonal, so that if I and J are distinct:
X(I)' * X(J) = ( X(I), X(J) ) = 0.

Given any set of vectors, Gram Schmidt othogonalization or the QR factorization can produce an orthonormal set of vectors, possibly fewer in number, that form a basis for the linear space that is spanned by the original set.

Outer Product

An outer product of two vectors x of dimension M and y of dimension N is a matrix A of order M by N whose entries are defined by:

A = x * y',
or, entrywise:
Ai,j = xi * yj

A matrix defined as the outer product of two vectors will usually have rank equal to 1. The only other possibility is that the matrix might actually have rank 0. If an outer product of two vectors is added to a matrix, this operation is known as a rank one update.
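
A minimal numpy sketch of an outer product and the rank one update it induces:

        import numpy as np

        x = np.array([1.0, 2.0, 3.0])        # dimension M = 3
        y = np.array([4.0, 5.0])             # dimension N = 2

        A = np.outer(x, y)                   # 3 by 2 matrix with A[i,j] = x[i] * y[j]
        print(np.linalg.matrix_rank(A))      # 1, since neither vector is zero

        B = np.ones((3, 2))
        B_updated = B + A                    # a rank one update of B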

Overdetermined System

An overdetermined linear system A * x = b is, loosely speaking, a set of M linear equations in N variables, with M > N.

Actually, it is possible for many of the equations to be redundant, so that what looks formally like an overdetermined system could "actually" be determined or underdetermined, if we'd just throw away the redundant equations. However, just recognizing and removing the redundancy is a difficult task, except for small problems and exact arithmetic. Thus, we will simply use the term "overdetermined" for any situation in which there are more equations than variables.

Except in special artificial cases, an overdetermined linear system has no solution. In such a case, we might be interested in an approximate solution which satisfies as many equations exactly as possible, which we might find simply by using pivoting to choose the equations to be satisfied, or a solution x which minimizes the norm of the residual A*x-b, which we might find using the QR factorization.

P Matrix

A P matrix is a square matrix with the property that every principal minor matrix has a (strictly) positive determinant.

A P0 matrix is a square matrix with the property that every principal minor matrix has a determinant that is positive or zero.

A nonsingular M matrix is guaranteed to be a P matrix.

A matrix which is both a P matrix and a Z matrix is guaranteed to be a nonsingular M matrix.

Permanent of a Matrix

The permanent of a square matrix is a scalar value defined in a way similar to the determinant.

An explicit formula for the permanent of a matrix A is:

permanent ( A ) = sum [ over all P ] A(1,P(1)) * A(2,P(2)) * ... * A(N,P(N)).
where the sum ranges over all possible permutations P of the numbers 1 through N. This differs from the definition of the determinant only in that the sign of the permutation is not taken into account.

For example, suppose A is

        1 2 3
        4 5 6
        7 8 9
      

Then the permanent of A is


        permanent(A) = 1 * 5 * 9 + 2 * 6 * 7 + 3 * 4 * 8  (diagonals)
                     + 1 * 6 * 8 + 2 * 4 * 9 + 3 * 5 * 7  (antidiagonals)
      

The trick of doing all diagonals and antidiagonals works for the 3 by 3 case, but no other. When going to the 4 by 4 case, for instance, you would do best by expanding along a row, and computing the permanents of the 3 by 3 minors, similar to the way that determinants of 4 by 4 or 5 by 5 matrices can be computed by (patient) hand.

Determinants are preserved by elementary row or column operations, but permanents are not. Thus, to compute a determinant efficiently, it is actually cheaper to carry out Gauss elimination to the point where we have triangularized the matrix. Then the product of the diagonal elements, times the sign of the permutation, gives us the determinant. But for permanents, this option is not available!
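
A minimal Python sketch of the row-expansion idea mentioned above; the cost grows factorially, so it is only practical for very small matrices, and the function name is illustrative:

        import numpy as np

        def permanent(A):
            # Expand along the first row; unlike the determinant, no signs alternate.
            A = np.asarray(A)
            n = A.shape[0]
            if n == 1:
                return A[0, 0]
            total = 0
            for j in range(n):
                minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
                total += A[0, j] * permanent(minor)
            return total

        print(permanent([[1, 2, 3],
                         [4, 5, 6],
                         [7, 8, 9]]))        # prints 450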

Permutation Matrix

A permutation matrix is a square matrix for which all the entries are 0 or 1, with the value 1 occurring exactly once in each row and column. Such a matrix, when premultiplying or postmultiplying another matrix, will simply permute the rows or columns of that matrix.

For example, the following is a permutation matrix:

        0 1 0
        0 0 1
        1 0 0
      
If A is the matrix
        11 12 13
        21 22 23
        31 32 33
      
then P*A permutes the rows of A:
        21 22 23
        31 32 33
        11 12 13
      
while A*P permutes the columns:
        13 11 12
        23 21 22
        33 31 32
      

Simple facts about a permutation matrix P:

The use of pivoting during Gauss elimination means that along with the LU factors of A there is also a permutation matrix factor P:

P * L * U = A.

Similarly, column pivoting may be used during the QR factorization of a matrix, and in that case, the actual factorization is not A=Q*R, but A=Q*R*P, for some permutation matrix P.

Perron-Frobenius Theorem

The Perron-Frobenius Theorem tells us a great deal about the largest eigenvalue of a positive or nonnegative matrix.

The Perron-Frobenius Theorem for a positive matrix: If the square matrix A is positive, then

The Perron-Frobenius Theorem for a nonnegative matrix: If the nontrivial matrix A is nonnegative, then

The Perron-Frobenius Theorem for a nonnegative irreducible matrix: If the nontrivial matrix A is nonnegative and irreducible, then in addition:

The Perron-Frobenius Theorem for a nonnegative irreducible primitive matrix: If the nontrivial matrix A is nonnegative and irreducible and primitive, then in addition:

Persymmetric Matrix

A persymmetric matrix A is a square matrix whose values are "reflected" across its main anti-diagonal, that is, for all I and J:

A(I,J) = A(N+1-J,N+1-I).

Here is an example of a persymmetric matrix:

        4 3 2 1
        7 6 5 2
        9 8 6 3
       10 9 7 4
      

Pivoting

Pivoting is the attempt to improve the accuracy of a linear algebra algorithm by choosing the most suitable column or row for use during a single step.

The most common instance of pivoting occurs in Gauss elimination. During the first step, we wish to eliminate all the entries in column one except for one entry. We will do this by adding carefully chosen multiples of row one to the other rows. For example, if the first two rows were

 
        2 3 5 8
        6 8 9 2
      
we could add -3 times row 1 to row 2, and thus zero out the "6" in the first column of row 2. This scheme would not work, however, if the first entry in the first row were zero, because no multiple of zero can be anything but zero.

Thus, we allow ourselves to interchange the first row with some other row which has a nonzero entry in the first column. But since we're going to interchange rows anyway, it turns out that best accuracy occurs if we choose to bring in the entry with the largest absolute value. Such a scheme, which considers each column in order, and uses the largest entry in the column as the pivot, is called partial pivoting. This form of pivoting is used in most linear algebra software.

It can be shown that Gauss elimination can be carried out without pivoting if each of the principal minors of the matrix is nonsingular. If the matrix is positive definite, then this condition is guaranteed, and it is common to factor the matrix without pivoting.

A complete pivoting or full pivoting scheme would search for the entry of largest magnitude anywhere in the uneliminated portion of the matrix, and use that entry as the next pivot. Such a scheme requires a lot more searching, and auxiliary storage of the row and column numbers of the pivots. It is little used, since it does not seem to bring a great improvement in accuracy.

QR factorization can also use pivoting. At step I, columns I through N are examined. The column with the greatest norm is interchanged with column I, and then the QR operations for that step are carried out. In this case as well, pivoting is done to try to ensure stability.
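
A minimal numpy sketch of Gauss elimination with partial pivoting follows, to make the pivot selection explicit; a library routine should be preferred in practice:

        import numpy as np

        def gauss_solve(A, b):
            A, b, n = A.astype(float), b.astype(float), len(b)
            for k in range(n - 1):
                p = k + np.argmax(np.abs(A[k:, k]))    # row with the largest entry in column k
                if p != k:
                    A[[k, p]] = A[[p, k]]              # interchange rows k and p
                    b[[k, p]] = b[[p, k]]
                for i in range(k + 1, n):
                    m = A[i, k] / A[k, k]              # row multiplier
                    A[i, k:] -= m * A[k, k:]
                    b[i] -= m * b[k]
            x = np.zeros(n)
            for i in range(n - 1, -1, -1):             # back substitution
                x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
            return x

        A = np.array([[2.0, 3.0, 5.0],
                      [6.0, 8.0, 9.0],
                      [4.0, 1.0, 7.0]])
        b = np.array([1.0, 2.0, 3.0])
        assert np.allclose(gauss_solve(A, b), np.linalg.solve(A, b))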

Polar Decomposition

The polar decomposition of any matrix A has the form

A = P * Q
where P is positive semidefinite and has the same rank as A, and Q is unitary.

The polar decomposition of a matrix can be determined from its singular value decomposition:

A = U * D * V'
= ( U * D * U' ) * ( U * V' )
= P * Q
where we have
P = U * D * U'
Q = U * V'

Positive definite matrix

A (complex) matrix A is positive definite if it is true that for every nonzero (complex) vector x, the product:

x* * A * x > 0,
where, in particular, we are requiring that this product be a real number.

A real matrix A is restricted positive definite if it is true that for every nonzero real vector x:

x' * A * x > 0.
If a real matrix is positive definite, then it is restricted positive definite.

Every complex positive definite matrix is guaranteed to be Hermitian. Thus, the phrase positive definite Hermitian is, strictly speaking, redundant. However, a restricted positive definite matrix is not guaranteed to be symmetric! This can happen because we only use real vectors in the test product.

Here is a simple example of a restricted positive definite matrix which is not symmetric:

        1  1
       -1  1
      
because x' * A * x = x1^2 + x2^2.

Whenever real positive definite matrices occur in practical applications, however, they are assumed or required to be symmetric. If necessary, one can decompose a positive definite matrix in the restricted sense into its symmetric and antisymmetric parts. The symmetric part will actually be (fully) positive definite.

Simple facts about a positive definite matrix A:

A matrix about which no such information is known is called indefinite. A matrix for which the product is always nonnegative is sometimes called positive indefinite or positive semidefinite. There are similar definitions of negative definite and negative indefinite.

Positive Matrix

A positive matrix A has only positive entries; that is, for all indices I and J,

Ai,j > 0.

Similar terms, with obvious definitions, include matrices that are nonnegative, negative and nonpositive. It's easy to check if a matrix is positive; this is much simpler than checking whether a matrix is positive definite.

Simple facts about a positive matrix A:

The Power Method

The power method is a simple iterative algorithm for computing the eigenvalue of largest magnitude in a matrix, and its corresponding eigenvector.

The algorithm for an arbitrary matrix A works as follows:

The speed of convergence of the algorithm depends largely on the ratio between the magnitude of the largest and the second largest eigenvalues. If the second largest eigenvalue is close in magnitude to the largest, convergence will be slow. If A has several distinct eigenvalues of the same magnitude, say -2 and 2, or 5, 3+4i and 3-4i, then the algorithm will fail.

While this algorithm is very easy to program, it is difficult to adapt it to the case where other eigenvalues are desired. The use of deflation is a possibility, but still forces the user to compute the eigenvalues one at a time, and in order of size, making it rather difficult to find, say, just the smallest eigenvalue.

The inverse power method, a generalization of the power method, has the advantage of being able to find the other eigenvalues of the matrix.
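
As a minimal numpy sketch of the power method as it is commonly implemented (repeated multiplication and normalization, here with a fixed number of steps rather than a convergence test):

        import numpy as np

        def power_method(A, iters=200):
            x = np.random.rand(A.shape[0])
            for _ in range(iters):
                y = A @ x
                x = y / np.linalg.norm(y)      # normalize to avoid overflow or underflow
            lam = x @ A @ x                    # Rayleigh quotient estimate of the eigenvalue
            return lam, x

        A = np.array([[2.0, 1.0],
                      [1.0, 3.0]])
        lam, x = power_method(A)
        print(lam)     # approaches (5 + sqrt(5))/2, the dominant eigenvalue of this A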

Preconditioner

A preconditioner is a matrix M most commonly used to improve the performance of an iterative method for linear equations.

The simplest preconditioning is done using a left preconditioner. The original linear system is left-multiplied by the inverse of the preconditioning matrix, resulting in the system

M^-1 * A * x = M^-1 * b
For more complicated preconditioning, both left and right preconditioners, M1 and M2 may be used, so that the system is transformed to
M1^-1 * A * M2^-1 * M2 * x = M1^-1 * b
A left and right preconditioning like this can be used if the original matrix is symmetric, and it is desired to preserve this symmetry in the transformed system.

The convergence of the iterative scheme depends on properties of the system matrix. A suitable choice of a preconditioner can mean that the transformed problem converges much more rapidly than the original one.

Although the definition of the preconditioner matrix suggests that we have to compute its inverse, it is usually the case that we have a factorization of the preconditioner which allows us to solve linear systems. Any computations that formally involve the inverse can therefore be replaced by linear system solves.

Desirable properties of a preconditioner matrix include:

Examples of preconditioners include

Primitive Matrix

A primitive matrix is a (square) matrix which is irreducible and which has only one eigenvalue of maximum modulus.

Note that we are not requiring that the eigenvalue be simple (that is, of multiplicity 1). Rather, we are requiring that there not be two distinct eigenvalues of the same modulus.

The Perron Frobenius Theorem applies to positive matrices. It is possible to extend most of the Perron-Frobenius theorem to a nonnegative matrix, if it is assumed to be a primitive matrix.

Projection Matrix

A matrix A is a projection matrix if it is idempotent and hermitian (or symmetric in the real case).

A projection matrix is sometimes referred to as an orthogonal projection matrix, but in this case the matrix is not generally what is termed an orthogonal matrix; rather, what is being referred to is the fact that A projects a vector x orthogonally onto a given subspace. Thus, the vectors A*x and (I-A)*x comprise a decomposition of x into two mutually orthogonal components, one lying in the space spanned by the columns of A, and one in the space orthogonal to it.

Simple facts about a projection matrix A:

Given a set of k linearly independent vectors v1, v2, ..., vk, and letting F = [ v1, v2, ..., vk ], the projection onto the space spanned by these vectors is

A = F * inverse ( F^H * F ) * F^H

The previous statement can be rephrased as follows: given an m by n rectangular matrix F, with m>=n and maximal column rank (that is, the columns are linearly independent), the matrix

A = F * inverse ( FH * F ) * FH
can be used to determine the projection of any vector onto the column space of F.

If, in fact, the columns of F happen to be of unit L2 norm and pairwise orthogonal, (that is, the columns are an orthonormal set of vectors), then FH * F = I, so the formula for the projector simplifies to:

A = F * FH
This particular formulation often occurs when a QR factorization has been used to extract an orthonormal basis for the column space.

Note, in particular, that it must be the case that A * F = F. When solving the overdetermined linear system,

F * x = b
it is natural to consider the related projected linear system
A * F * y = F * y = A * b
which can be solved exactly, because A*b lies in the column space of F. Moreover, this solution y is the best solution to the original problem F * x = b, in the least squares sense, that is, it achieves the residual of minimum L2 norm. The matrix A is sometimes called the hat matrix for matrix F. It is often used in statistics; in some cases, the diagonal entries of A are of considerable interest.

In the simple case where we are given one (nonzero) vector v, then F=v, and the projection onto the space spanned by v is

A = v * inverse ( v' * v ) * v' = v * v' / ||v||^2
using the l2 norm; hence the matrix that projects a vector into the orthogonal complement of v is
I - A = I - v * v' / ||v||^2,
or, if v is already of unit l2 norm, simply I-v*v'.
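These formulas are easy to check numerically. Here is a sketch in Python (numpy); the matrix F is an arbitrary example with linearly independent columns:

import numpy as np

# An arbitrary 4 by 2 matrix with linearly independent columns.
F = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [1.0, 2.0]])

# Projector onto the column space of F:  A = F * inverse ( F'F ) * F'
A = F @ np.linalg.solve(F.T @ F, F.T)

print(np.allclose(A, A.T))        # symmetric
print(np.allclose(A @ A, A))      # idempotent
print(np.allclose(A @ F, F))      # A * F = F

# Decompose an arbitrary vector into two orthogonal components.
x = np.array([1.0, 2.0, 3.0, 4.0])
x_in, x_out = A @ x, x - A @ x
print(abs(x_in @ x_out) < 1.0e-12)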

Property A

A matrix A has property A if the indices 1 to N can be divided into two sets S and T so that, for any Ai,j which is not zero, it must be the case that either I = J, or else one of the indices I and J belongs to S and the other belongs to T.

This is equivalent to saying that, by listing the S rows and columns first, the matrix can be rewritten in the block form:

        | D1  F  |
        | G   D2 |
      
where D1 and D2 are diagonal.

The Pseudoinverse

The pseudoinverse is a generalization of the idea of the inverse matrix, for cases where the standard inverse matrix cannot be applied. Such cases include matrices A which are singular, or rectangular.

The pseudoinverse is sometimes called the Moore Penrose inverse or the generalized inverse.

The pseudoinverse of an M by N rectangular matrix A is defined as the unique N by M matrix A+ which satisfies the four conditions:

A * A+ * A = A
A+ * A * A+ = A+
(A * A+)' = A * A+
(A+ * A)' = A+ * A
Note that if A is a square invertible matrix, then the pseudoinverse is actually the inverse.

The pseudoinverse can be used in a way similar to the way an inverse is used. For instance, given the rectangular set of linear equations

A * x = y
a "solution" can be computed as:
x = A+ * y.
If the equations form a consistent system, then x will actually satisfy the equations. Otherwise, x will be a "best possible" solution, in the sense that it minimizes the Euclidean norm of the residual error.

The pseudoinverse can be computed from the information contained in the singular value decomposition, which has the form:

A = U * S * V'
where U is an M by M orthogonal matrix, S is an M by N diagonal matrix of singular values, and V is an N by N orthogonal matrix. The formula for the pseudoinverse of A is then:
A+ = V * S^ * U'
where S^ is an N by M diagonal matrix, whose diagonal entries are the reciprocals of the corresponding diagonal entries of S, except that where S has zero entries, so does S^.
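A short sketch in Python (numpy), building A+ from the singular value decomposition and checking it against the library pseudoinverse routine; the matrix is an arbitrary example:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])    # 3 by 2, so no ordinary inverse exists

# Pseudoinverse via the singular value decomposition.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
s_inv = np.array([1.0 / x if x > 1.0e-12 else 0.0 for x in s])
A_plus = Vt.T @ np.diag(s_inv) @ U.T

print(np.allclose(A_plus, np.linalg.pinv(A)))   # matches numpy's pinv
print(np.allclose(A @ A_plus @ A, A))           # first Penrose condition

# Least squares "solution" of an inconsistent system A * x = y.
y = np.array([1.0, 0.0, 0.0])
x = A_plus @ y
print(np.allclose(x, np.linalg.lstsq(A, y, rcond=None)[0]))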

QR Factorization

The QR factorization factors a matrix A into an orthogonal matrix Q and an upper triangular matrix R, so that

A = Q * R.
The factorization can also be applied to rectangular matrices, in which case one of the factors is no longer square.

The QR factorization can be useful for solving the full variety of linear systems, whether nonsingular, under-determined, over-determined or ill conditioned. It can be used to carry out the Gram Schmidt orthogonalization of a set of vectors constituting the columns of A. The QR factorization is also used repeatedly in an iterative solution of eigenvalue problems.

The QR factorization can be produced incrementally, by a series of transformations involving Householder matrices or Givens rotation matrices

As an example of QR factorization, the matrix A:

        1 1 0
        1 0 1
        0 1 1
      
can be factored into the orthogonal matrix Q:
        SQRT(1/2)  SQRT(1/6) -SQRT(1/3)
        SQRT(1/2) -SQRT(1/6)  SQRT(1/3)
           0       SQRT(2/3)  SQRT(1/3)
      
and the upper triangular matrix R:
        SQRT(2) SQRT(1/2) SQRT(1/2)
           0    SQRT(3/2) SQRT(1/6)
           0       0      SQRT(4/3)
      

LAPACK and LINPACK include routines for computing the QR factorization.
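For example, the factorization above can be reproduced, up to the signs of the columns of Q and the corresponding rows of R, with numpy's QR routine; a short sketch:

import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(3)))   # Q is orthogonal
print(np.allclose(np.triu(R), R))        # R is upper triangular
print(np.allclose(Q @ R, A))             # Q * R reproduces A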

QR Method for Eigenvalues

The QR method is used to find the eigenvalues of a square matrix.

For efficiency, the method begins by transforming the matrix A using Householder matrices until we have determined an upper Hessenberg matrix A1 which is orthogonally similar to A.

The QR factorization of A1 is then computed, A1 = Q1*R1. Then the factors Q1 and R1 are reversed, to compute a new matrix A2 = R1*Q1. A2 itself can be QR factored into A2 = Q2*R2, and then reversal of factors results in a matrix A3, and so on.

Each matrix A1, A2, and so on is orthogonally similar to A, and so shares the same eigenvalues. The sequence of matrices A, A1, A2, A3, ... will generally converge to a matrix B of a very simple type: B will be block upper triangular, with diagonal blocks of order 1 or 2. The 1 by 1 diagonal blocks are the real eigenvalues, and the eigenvalues of the 2 by 2 blocks are the complex eigenvalues.

In the particular case where A is symmetric (so that its eigenvalues are all real and B is diagonal), and if the orthogonal transformations have been accumulated along the way, the result can be written:

A = Q * B * Q'
which is an orthogonal similarity transformation of A into a diagonal matrix, or
A * Q = Q * B
which means that the columns of Q are the eigenvectors.

If the matrix B is not diagonal, then the pair of columns corresponding to any 2 by 2 block can be used to construct the pair of corresponding complex eigenvectors. Alternatively, once the eigenvalues have been determined, the inverse power method can be used to compute any particular eigenvector.
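A bare-bones sketch of the unshifted iteration in Python (numpy); practical codes add the Hessenberg reduction, shifts, and deflation described above, all of which are omitted here:

import numpy as np

def qr_eigenvalues(A, steps=200):
    # Unshifted QR iteration: factor, then reverse the factors.
    Ak = A.copy()
    for _ in range(steps):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q                  # orthogonally similar to A
    return Ak                       # (block) upper triangular in the limit

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
B = qr_eigenvalues(A)
print(np.diag(B))                          # approximate eigenvalues
print(np.sort(np.linalg.eigvals(A).real))  # compare with a library routine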

Quaternion Representation

Quaternions, discovered by William Hamilton, have the form a+bi+cj+dk, where i, j and k are "special" quantities.

The properties of "1" and the 3 special quantities are best displayed in a multiplication table:
         *  |  1   i   j   k
        ----+----------------
         1  |  1   i   j   k
         i  |  i  -1   k  -j
         j  |  j  -k  -1   i
         k  |  k   j  -i  -1

It is possible to devise matrices that behave like quaternions. Let the value "1" be represented by the identity matrix of order 4, and the value "i" be represented by

        0  1  0  0
       -1  0  0  0
        0  0  0 -1
        0  0  1  0
      
the value "j" by
        0  0  1  0
        0  0  0  1
       -1  0  0  0 
        0 -1  0  0
      
the value "k" by
        0  0  0  1
        0  0 -1  0
        0  1  0  0
       -1  0  0  0
      
Then it is easy to show that these matrices, and linear combinations of them, obey the rules of quaternions.

Rank One Matrix

A rank one matrix is simply a matrix with rank one. The simplest example would be any matrix with a single nonzero entry. Another simple example is any matrix whose columns (or rows) are all equal (and not entirely zero). Finally, any matrix whose columns (or rows) are all a multiple of the first column (or row) is a rank one matrix.

These matrices are not simply a peculiarity. They can be represented in an interesting way. An M by N rank one matrix A must have the representation

A = u * v'
where u is an M by 1 vector and v is an N by 1 vector. This linear algebraic representation means that rank one matrices can be used to analyze a number of interesting structures, including the singular value decomposition.

Rank One Update

A rank one update to a matrix is a simple but useful perturbation whose effects can be calculated.

If we suppose that we have an m by n matrix A, then a rank one update is accomplished by adding the outer product of an m vector u and an n vector v:


        B = A + u v'
      

If the matrix A is actually square, then the rank of A only changes by +1, -1 or 0 when a rank one update is applied. If A is actually invertible, then the Sherman Morrison Formula shows how to cheaply compute the inverse of the updated matrix.

If the matrix A is the identity matrix, then we have


        det ( I + u v' ) = 1 + v'u.
      

Rayleigh Quotient

The Rayleigh quotient of a matrix A and vector x is

R(A,x) = (x'*A*x) / (x'*x).

If the matrix A is symmetric, then the Rayleigh quotient is an excellent estimate of the "nearest" eigenvalue; in particular,

Lambdamin <= R(A,x) <= Lambdamax

On the other hand, if the matrix is not symmetric, then we cannot guarantee that the eigenvalues will be real; moreover, we cannot guarantee that the eigenvectors will be orthogonal. Thus for a general matrix, the information obtained from the Rayleigh quotient must be interpreted more carefully.

A frequent use of the Rayleigh quotient is in the inverse power method. Starting with approximations for the eigenvalue and eigenvector, one step of the inverse power method gives an improved estimate of the eigenvector. Then the Rayleigh quotient improves the eigenvalue estimate, and will be used to shift the next step of the inverse power method. This pair of steps can be rapidly convergent.

If the matrix A is actually positive definite symmetric, then the quantity (y'*A*x) has all the properties of an inner product. In that case, the Rayleigh quotient may be generalized to involve any pair of vectors:

R2(y,A,x) = ( y'*A * x ) / ( y'*x ).

If the matrix is not symmetric, but we still want a sensible way to estimate the eigenvalues, then the "Unsymmetric Rayleigh Quotient" could be defined as:

URQ(A,x) = sqrt ( ( ( A * x)'*(A * x) ) / ( x'*x ) ).
This amounts to the square root of the Rayleigh quotient for the symmetric matrix A'*A, but can also be regarded as the ratio of the L2 norms of A*x and x.
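A small numerical illustration in Python (numpy); the symmetric matrix and the trial vector are arbitrary choices:

import numpy as np

def rayleigh_quotient(A, x):
    return (x @ A @ x) / (x @ x)

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # symmetric
x = np.array([1.0, 1.9])            # rough guess at an eigenvector

r = rayleigh_quotient(A, x)
lam = np.linalg.eigvalsh(A)          # exact eigenvalues, in increasing order
print(r)                             # an estimate of the nearby eigenvalue
print(lam[0] <= r <= lam[-1])        # True: the quotient lies in [min, max]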

Rectangular Matrix

A rectangular matrix is a matrix which is not "square", that is, a matrix whose row order and column order are different.

While many operations and algorithms of linear algebra only apply to a square matrix, a rectangular matrix does have an LU factorization, and a singular value decomposition.

A rectangular matrix can occur when solving an under-determined or over-determined linear system.

Reduced Column Echelon Form

Reduced column echelon form is a matrix structure which is usually arrived at by a form of column-oriented Gauss Jordan elimination.

Any matrix can be transformed into this form, using a series of elementary column operations. Once the form is computed, it is easy to compute the determinant, inverse, the solution of linear systems (even for underdetermined or overdetermined systems), the rank, and solutions to linear programming problems.

A matrix (whether square or rectangular) is in reduced column echelon form if:

  1. Each nonzero column of the matrix has a 1 as its first nonzero entry.
  2. The leading 1 in a given column occurs in a row below the leading 1 in the previous column.
  3. Columns that are completely zero occur last.
  4. Each row containing a leading 1 has no other nonzero entries.

Reduced column echelon form is primarily of use for teaching, and analysis of small problems, using exact arithmetic. It is of little interest numerically, because very slight errors in numeric representation or arithmetic can result in completely erroneous results.

Reduced Row Echelon Form

Reduced row echelon form, often abbreviated to RREF, is a matrix structure which is usually arrived at by a form of Gauss Jordan elimination.

Any matrix, including singular and rectangular matrices, can be transformed into this form, using a series of elementary row operations. Once the form is computed, it is easy to compute the determinant, inverse, the solution of linear systems (even for underdetermined or overdetermined systems), the rank, and solutions to linear programming problems.

Moreover, the process can be considered a matrix factorization, of the form

A = B * E
where B is nonsingular, and E is in reduced row echelon form.

A matrix (whether square or rectangular) is in reduced row echelon form if:

  1. Each nonzero row of the matrix has a 1 as its first nonzero entry.
  2. The leading 1 in a given row occurs in a column to the right of the leading 1 in the previous row.
  3. Rows that are completely zero occur last.
  4. Each column containing a leading 1 has no other nonzero entries.

A slightly weaker format, known as row echelon form, eliminates the fourth requirement.

Reduced row echelon form is primarily of use for teaching, and analysis of small problems, using exact arithmetic. It is of little interest numerically, because very slight errors in numeric representation or arithmetic can result in completely erroneous results.

Reflection Matrix

A reflection matrix A is associated with some hyperplane P. Any vector v can be decomposed uniquely into

  v = vpar + vperp
where vpar is parallel to P and vperp is perpendicular to P. Then the reflection matrix A associated with P has the property that
  A*v = vpar - vperp

A reflection matrix A is involutory.

Examples of reflection matrices include the identity matrix, a diagonal matrix whose diagonal entries are +1 or -1, any matrix which rotates two coordinate axes by 180 degrees, and the Householder matrices.

Simple facts about a reflection matrix A:

Residual Error

The residual error is a measure of the error that occurs when a given approximate solution vector x is substituted into the equation of the problem being solved.

For a system of linear equations, the residual error is the vector r defined as

r = b - A * x

For the eigenvalue problem, the residual error is a vector which is a function of the approximate eigenvector x and the approximate eigenvalue lambda:

r = lambda * x - A * x.

When carrying out an iterative solution process, it is common to compute the residual error for each new approximate solution, and to terminate the iteration successfully if the vector norm of the residual error decreases below some tolerance.

An important fact to realize is that the residual error is not, by itself, a reliable estimate of the error in the solution of a linear system. If the residual error has small norm, we can really only hope that the solution error (between our computed x and the true solution x*) is small. We need to know the norm of the inverse of the matrix, in which case the following restriction holds:

||x - x*|| <= ||A-1|| * || A ( x - x* ) ||
= ||A-1|| * || A * x - b + b - A * x* ||
= ||A-1|| * ||r||,
so that when the norm of the residual error, ||r|| is small, we have a precise upper bound on the error in the solution. Since such matrix norms are generally not known or computable, what we really have is a promise of continuity in the errors: as we drive the residual down, we are forcing down the upper limit on the approximation error.

Root of Unity

An N-th root of unity is any complex number W such that W^N = 1.

For a given N, there are N such roots, which can be summarized as:

THETA = 2 * pi / N
W(J) = cos ( ( J - 1 ) * THETA ) + i * sin ( ( J - 1 ) * THETA ), for J = 1 to N.

Roots of unity are especially useful in discussing Fourier matrices and Circulant matrices.

Rotation

A rotation is a linear transformation which preserves (Euclidean) distances.

Because a rotation R is a linear transformation, the value of R * 0 must be 0; in other words, the origin does not move. Because a rotation preserves distances, it must be the case that, in the Euclidean vector norm,

|| R * x || = || x ||
for every vector x in the space. From this fact, we can conclude that:

Common examples of matrices which embody rotations include:

Row Echelon Form

Row echelon form, often abbreviated as REF, is a special matrix structure which is usually arrived at by a form of Gauss elimination.

Any matrix, including singular and rectangular matrices, can be transformed into this form, using a series of elementary row operations. Once the form is computed, it is easy to compute the determinant, inverse, the solution of linear systems (even for underdetermined or overdetermined systems), the rank, and solutions to linear programming problems.

Moreover, the process can be considered a matrix factorization, of the form

A = B * E
where B is nonsingular, and E is in row echelon form.

A matrix (whether square or rectangular) is in row echelon form if:

  1. Each nonzero row of the matrix has a 1 as its first nonzero entry.
  2. The leading 1 in a given row occurs in a column to the right of the leading 1 in the previous row.
  3. Rows that are completely zero occur last.

A slightly stronger condition, known as Reduced Row Echelon Form, adds an additional requirement that each column containing a leading 1 has no other nonzero entries.

Row echelon form is primarily of use for teaching, and analysis of small problems, using exact arithmetic. It is of little interest numerically, because very slight errors in numeric representation or arithmetic can result in completely erroneous results.

Row Rank

The row rank of a matrix is the number of linearly independent rows it contains.

The matrix

        1  2  3  4
        5  6  7  8
        9 10 11 12
      
has row rank 2, because row 3 is equal to twice row 2 minus row 1.

For any matrix, the row rank is the same as the column rank. This common value is also equal to the rank of the matrix, which is defined for both square and rectangular matrices.

For a square matrix of order n, the rank is a number between 0 and n. A square matrix whose rank is n is said to be nonsingular. For a rectangular matrix of order m by n, the rank must be a number between 0 and the minimum of m and n. Rectangular matrices which attain this maximal value are not called "nonsingular". Instead, they are said to have full row rank, full column rank, or maximal rank.

Row Space

The row space of an M by N matrix A is the set of all possible linear combinations of rows of A.

If the N vector v is a linear combination of rows of A, then there is an M vector c of coefficients with the property that

v = c * A
In other words, the row space is the set of all possible results of premultiplying A by an arbitrary vector.

Row Sum

The row sum vector of an M by N matrix A is the vector R of length M whose I-th entry is the sum of the elements of row I of the matrix.

A matrix is said to have constant row sum if all the elements of R are equal.

The operation of computing the row sum vector R is equivalent to multiplying the matrix A times a vector X whose entries are all 1:

R = A * X

Therefore, if a square matrix A has a constant row sum r, then A has the eigenvalue r with eigenvector X:

A * X = R = r * X

Matrices with constant row sum include

The Schur Complement

Suppose we are working with a block matrix A of the form


        A11 A12
        A21 A22
      
where A11 is square and nonsingular. The Schur complement of A11 in A is the matrix

        S = A22 - A21 * inverse ( A11 ) * A12
      

The Schur complement can be thought of as the data that would replace A22 after Gauss elimination (without pivoting) had been carried out on the matrix A, up to the point where A11 had been triangularized.

In particular, suppose we actually have the block linear system:


        A11 A12 * X1 = B1
        A21 A22   X2   B2
      

Then this can be rewritten as


        A11  A12                               * X1 = B1
        0    A22 - A21 * inverse(A11) * A12      X2   B2 - A21 * inverse(A11) * B1
      

The same approach can be applied repeatedly to matrices of more extensive block structure, as long as the diagonal blocks can easily be inverted.
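Here is a sketch of the block elimination in Python (numpy); the matrix, right hand side, and block sizes are arbitrary, and the inverse of A11 is applied by solving linear systems with A11 rather than being formed explicitly:

import numpy as np

# A 4 by 4 matrix, partitioned into 2 by 2 blocks.
A = np.array([[4.0, 1.0, 2.0, 0.0],
              [1.0, 3.0, 0.0, 1.0],
              [2.0, 0.0, 5.0, 1.0],
              [0.0, 1.0, 1.0, 4.0]])
A11, A12 = A[:2, :2], A[:2, 2:]
A21, A22 = A[2:, :2], A[2:, 2:]
b = np.array([1.0, 2.0, 3.0, 4.0])
b1, b2 = b[:2], b[2:]

# Schur complement of A11 in A.
S = A22 - A21 @ np.linalg.solve(A11, A12)

# Block elimination: solve for X2 first, then back-substitute for X1.
x2 = np.linalg.solve(S, b2 - A21 @ np.linalg.solve(A11, b1))
x1 = np.linalg.solve(A11, b1 - A12 @ x2)

print(np.allclose(np.concatenate([x1, x2]), np.linalg.solve(A, b)))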

The Schur Decomposition

The Schur decomposition is a decomposition of a square matrix. For a complex matrix, the decomposition involves a unitary factor, while for a real matrix, there is an orthogonal factor.

The eigenvalue / eigenvector system for a matrix, when it exists, results in a decomposition that includes a diagonal factor (the eigenvalues); the corresponding factor in the Schur decomposition is an upper triangular factor. However, in many cases the eigen decomposition does not exist for a given matrix, while the Schur decomposition can always be formed, as long as the matrix is square. For a more general and more useful decomposition, see the singular value decomposition.

The Unitary Schur Decomposition of a Complex Square Matrix

The Schur decomposition of a complex square matrix A is a factorization of the form

A = U * T * U*
where T is an upper triangular matrix, and U is unitary. The decomposition is also known as the Schur normal form. The form of the decomposition shows that the matrix T is unitarily similar to A.

Some facts about the Schur decomposition U * T * U* of a complex square matrix A:

The Orthogonal Schur Decomposition of a Real Matrix

If A is a real matrix, then it has the unitary Schur decomposition described above, but it also has an orthogonal decomposition that is "almost" upper triangular:

A = Q * T2 * Q'
where T2 is a block upper triangular matrix, and Q is an orthogonal matrix.

Some facts about the Schur orthogonal decomposition Q * T2 * Q' of a real square matrix A:

The Orthogonal Schur Decomposition of a Real Matrix with Real Eigenvalues

If A is a real square matrix, and has only real eigenvalues, then the matrix T2 in the orthogonal Schur decomposition must actually be a diagonal matrix, and so the decomposition has the form:

A = Q * LAMBDA * Q'

Some facts about the Schur decomposition Q * LAMBDA * Q' of a real square matrix A with only real eigenvalues:

Sherman Morrison Formula

The Sherman Morrison formula applies to a matrix which has been perturbed in a simple way, and produces the inverse, or the solution of a linear system for the perturbed system, based on information already computed for the original system.

The perturbation of the coefficient matrix A is assumed to be of the form:

C = A + u * v',
where the perturbation is an outer product of the vectors u and v. Modifying a matrix in this way is sometimes called a rank one update. Changing a single entry Ai,j of a matrix can be viewed in this way, for instance.

The Sherman Morrison formula for the inverse of the perturbed matrix is:

C-1 = A-1 - z * w' / ( 1 + alpha ),
where:
z = A-1 * u;
w = (A-1)' * v;
alpha = v' * z.

In the common case where the solution of C * x = b is desired, and the LU factorization of A has been computed, so that linear systems involving the original matrix A can be easily solved, the procedure is:

solve A * z = u for z;
solve A' * w = v for w;
set alpha = v' * z;
set beta = w' * b;
solve A * x = b for x;
adjust x := x - beta * z / ( 1 + alpha ).

The method will fail if 1 + alpha is 0, which can indicate singularity of the matrix C or a more technical problem.
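A sketch of this procedure in Python (numpy); the data are arbitrary, and the reuse of an existing LU factorization is represented, purely for brevity, by calls to a dense solver:

import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 0.0, 1.0])      # u * v' changes only the (1,3) entry

z = np.linalg.solve(A, u)          # solve A  * z = u
w = np.linalg.solve(A.T, v)        # solve A' * w = v
alpha = v @ z
beta = w @ b
x = np.linalg.solve(A, b)          # solve A  * x = b
x = x - beta * z / (1.0 + alpha)   # adjusted solution of C * x = b

C = A + np.outer(u, v)             # the rank one update, formed only to check
print(np.allclose(C @ x, b))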

Sign Symmetric Matrix

A sign symmetric tridiagonal matrix A is one for which the signs of every pair of corresponding off-diagonal entries are equal:

sign ( A(I,I+1) ) = sign ( A(I+1,I) )
for I = 1 to N-1. (A value of zero matches any sign of the other entry.)

A symmetric tridiagonal matrix is always sign symmetric. Some EISPACK routines handle the case of a sign symmetric tridiagonal matrix directly. A matrix is strictly sign symmetric if it is sign symmetric and zero entries only occur in pairs.

The following tridiagonal matrix is not symmetric, is sign symmetric, and not strictly sign symmetric:

        1  2  0  0  0
        3  4 -7  0  0
        0 -3  4  2  0
        0  0  5  3  0 
        0  0  0  1  9
      

Similar Matrix

Two matrices A and B are similar if B is related to A by a matrix P in the following way:

B = P-1 * A * P.
In this case, P is said to be the similarity transformation matrix.

Matrices which are similar have the same eigenvalues. Special cases include the similarity matrix P being an elementary transformation matrix, or an orthogonal matrix or a unitary matrix.

Many algorithms try to improve speed or efficiency by using similarity transforms on an input matrix A, so as to find a simpler matrix B for which the problem can be more easily or more quickly solved. It may then be necessary to take the answer for the problem about B and "backtransform" it to an answer for the problem about A. For example, if we get the eigenvalues and eigenvectors of B, A will have the same eigenvalues, but will have different eigenvectors, related by the similarity transform.

Every symmetric matrix A is orthogonally similar to a diagonal matrix. Since the inverse of an orthogonal matrix is its transpose, this relationship may be written

B = Q-1 * A * Q = Q' * A * Q.

Simple Eigenvalue

A simple eigenvalue is an eigenvalue, or root of the characteristic equation, which has the property that it is not a repeated root. Another way of stating this is that a simple eigenvalue has an algebraic multiplicity of 1.

Since every eigenvalue has at least one eigenvector, and the geometric multiplicity is never greater than the algebraic multiplicity, a simple eigenvalue is associated with a 1-dimensional eigenspace.

A semisimple eigenvalue is an eigenvalue for which the algebraic and geometric multiplicities are equal. A semisimple eigenvalue of algebraic multiplicity K is associated with a K-dimensional eigenspace, and there are K linearly independent eigenvectors. Of course, any simple eigenvalue is also a semisimple eigenvalue.

A matrix is diagonalizable if and only if every eigenvalue is semisimple.

The remaining possibility, called a nonsimple eigenvalue, is one for which the algebraic multiplicity K is greater than one, and the geometric multiplicity is smaller than K. In this case, there are fewer than K linearly independent eigenvectors associated with the eigenvalue.

Singular Matrix

A singular matrix is a square matrix that does not have an inverse.

Facts about a singular matrix A:

Singular Value Decomposition

The singular value decomposition of a real rectangular M by N matrix A is a factorization of the form:

A = U * S * V'
where, in the "standard SVD" used by LINPACK and LAPACK and MATLAB: or, in the "economy" version (available as an option in MATLAB), if there are exact R nonzero singular values: or, in the EISPACK version:

The standard form has the advantage that U and V are both orthogonal, and S retains the "shape" of A. Moreover, this format allows us to consider U and V to be composed of left and right singular vectors, in analogy to the eigenvalue factorization (when it exists) of square matrix. Also, the extra information in the full SVD gives a basis for the null space of A.

One use for the singular value decomposition is that it allows us to "solve" any linear system A*x=B, whether singular or nonsingular, overdetermined or underdetermined. Except for the nonsingular square case, the solution is only done in a generalized sense. In particular, if there are multiple solutions, then the singular value decomposition can be used to find the solution of minimum l2 norm. If there are no exact solutions, then the singular value decomposition can be used to find a "solution" for which the residual has minimum l2 norm.

The generalized solution of A * x = b is equal to

x = V * S^ * U' * b
where S^ is a diagonal matrix whose entries are the inverse of the nonzero entries of S, or 0 when the corresponding entry of S is zero.

For any column I no greater than the minimum of M and N, let ui be the I-th column of U, and vi be the I-th column of V, and sii be the I-th diagonal element of S. Then it is a fact that

A * vi = sii * ui
and
A' * ui = sii * vi
which allows us to conclude that
A * A' * ui = sii * sii * ui
and
A' * A * vi = sii * sii * vi
In other words, U, V and S contain information about the eigenvalues and eigenvectors of A * A'.

Conversely, if we know the eigenvalues and eigenvectors of A * A', then we know the squares of the singular values of A, and the left singular vectors of A (the U matrix).

Moreover, if a square matrix A is symmetric, then:

Facts about the singular values of a matrix A include:

The columns of U and V can be thought of as forming a sequence of min ( M, N ) rank one matrices A1, A2, and so on, with

A1 = U(1:M,1) * S(1,1) * V(1:N,1)',
A2 = U(1:M,2) * S(2,2) * V(1:N,2)',...
These rank one matrices have the property that they sum up to A. Moreover, A1 is the rank one matrix which best approximates A in the l2 norm, while A1+A2 is the best rank two approximant, and so on. This property means that initial column subsets of U and V can be used for a reduced order model of A.
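A sketch of this rank one expansion in Python (numpy); the matrix is an arbitrary example:

import numpy as np

A = np.array([[4.0, 0.0, 2.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 5.0],
              [2.0, 0.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# The rank one terms  s(i) * u_i * v_i'  sum up to A.
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s))]
print(np.allclose(sum(terms), A))

# The leading k terms give the best rank k approximation in the 2-norm;
# the error of the rank two approximation equals the third singular value.
A2 = terms[0] + terms[1]
print(np.isclose(np.linalg.norm(A - A2, 2), s[2]))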

The singular value decomposition can be used to construct the pseudoinverse of the rectangular or singular matrix A.

Routines for singular value decomposition are included in EISPACK, LAPACK, LINPACK, MATLAB, MINPACK, and NUMPY.LINALG.

Skew CentroSymmetric Matrix

A skew centrosymmetric matrix is one which is antisymmetric about its center; that is,

Ai,j = - Am+1-i,n+1-j

Example:

         1  10   8  11  -5
       -13   2   9   4 -12
        -6  -7   0   7   6
        12  -4  -9  -2  13
         5 -11  -8 -10  -1
      

Facts about a skew centrosymmetric matrix A:

Skew Circulant Matrix

A matrix A is skew circulant if it can be formed by negating the strict lower triangle of a circulant matrix.

Example:

        a  b  c  d  e  f
       -f  a  b  c  d  e
       -e -f  a  b  c  d
       -d -e -f  a  b  c 
       -c -d -e -f  a  b
       -b -c -d -e -f  a
      

A skew circulant matrix is a Toeplitz matrix.

Skew Hermitian Matrix

A complex matrix A is skew Hermitian if it is equal to the negative of its transpose complex conjugate:

A = - ( conjugate ( A ) )'.

A skew Hermitian matrix must have a purely imaginary diagonal.

Skyline Matrix Storage

Skyline storage is a matrix storage method for storing a particular kind of sparse matrix. The format is simple, compact, and suitable for use with Gaussian elimination.

Skyline storage is most typically used with symmetric matrices derived from a finite element problem. In this setting, it is common to encounter a matrix which is "nearly" banded, or "raggedly banded". That is, the nonzero elements of the matrix are always near the diagonal, but the number of such elements varies from column to column.

Here is a matrix suitable for skyline storage:

        A11 A12  0  A14  0
        A21 A22  0  A24  0
         0   0  A33  0   0
        A41 A42  0  A44 A45
         0   0   0  A54 A55
      

If the matrix is symmetric, we can regard the matrix as a sequence of columns, starting with the first nonzero element in a column, and proceeding to the diagonal, including every entry in between, whether or not it is zero. Thus, this matrix could be regarded as equivalent to the following five column vectors:

                       A14 
                       A24 
             A12       A34  A45 
        A11  A22  A33  A44  A55 
      

(Note that we have added a location for A34. It's storing a zero, but it's between a nonzero entry, A24, and the diagonal A44). This is the heart of the idea behind skyline storage. We simply cram all these columns into a single vector:

        A11, A12, A22, A33, A14, A24, A34, A44, A45, A55,
      
and then figure out a way to address the individual entries. The obvious route is to have a way of pointing to the diagonal entries:
        A11 is in entry 1, 
        A22 in entry 3, 
        A33 in entry 4, 
        A44 in entry 8,
        A55 in entry 10.
      
and we can store these values in an array called "INDEX". Then, we know that column 4 of the matrix is contained in locations INDEX(3)+1 through INDEX(4).

The reason that skyline storage is ideal for Gaussian elimination is that we have already set aside all the entries we will ever need to take care of "fill in", as long as we are not performing any pivoting.

For instance, consider what happens when we are eliminating row 1. We will add multiples of row 1 to rows 2 and 4. If there is a nonzero entry in a particular column of row 1, then that entry could cause "fill in" when added to row 2 or 4. But there is no problem, because if column J of row 1 is nonzero, then we're sure we've already set aside an entry for column J of rows 2 through row J. So we will never have to modify our storage scheme because of unexpected nonzero entries generated by fill-in.

On the other hand, it should be clear that indexing entries of the matrix can be tedious or cumbersome. In particular, it is much harder to find entries of a row of the matrix than of a column. Writing a routine to actually carry out Gauss elimination with such a data structure can be something of a trial.

Thus skyline storage achieves simplicity and compactness in storage, at the cost of complexity in coding.

Span of a Set of Vectors

The span of a set of vectors { v(i)| 1 <= I <= N } is the set of vectors that can be constructed from the set via linear combinations. The span is sometimes called the spanning space.

The span of a set of vectors is a linear space; it includes the 0 vector, each of the original vectors, and sums and multiples of them.

As an example of a vector span, consider the eigenvectors associated with a particular eigenvalue of a matrix. If the eigenvalue has algebraic multiplicity of 1, then we usually think there is only one associated eigenvector; if the eigenvalue has a higher multiplicity, there may be 2 or more linearly independent eigenvectors. In any case, the eigenspace associated with the eigenvalue is the span of these eigenvectors. Any element of this span is itself also an eigenvector for the eigenvalue.

Given a set of vectors which define a spanning space, it is often desired to know whether a subset of those vectors will define the same spanning space; that is, whether we can toss out some of the vectors because they don't add anything to the span. If we toss out as many vectors as possible, we end up with a basis for the span; the number of vectors remaining tells us the dimension of that space.

The Sparse BLAS

The Sparse BLAS are a set of vector oriented linear algebra routines useful for sparse matrix problems.

The success of the Basic Linear Algebra Subprograms (BLAS) motivated the creation of a small set of similar routines for use with sparse vectors. The routines enable the interaction of sparse and full vectors. The sparse vectors are assumed to be stored as a vector of values, and a vector of indices. For example, a sparse vector might be represented by the pair

        X = (1.0, 2.0, 3.0)
        IX = (9, 1, 200)
      
which represents a vector that is entirely zero, except that entry 1 equals 2.0, entry 9 equals 1.0 and entry 200 equals 3.0.

FORTRAN subroutines which implement the sparse BLAS are available through the NETLIB web site.

The single precision Sparse BLAS routines include:

Sparse Matrix

A sparse matrix is a matrix with so many zero entries that it is profitable to use special schemes to store the matrix and solve problems involving it. A rule of thumb is that if less than 5% of the matrix is nonzero, the matrix can be called sparse.

Not only does a sparse matrix have few nonzeroes, but these nonzeroes are typically scattered seemingly at random throughout the matrix. Thus, unlike a band matrix, (which also is "mostly zero") we have to expend a significant amount of effort simply recording where every nonzero entry occurs.

Sparse Matrix Storage

A sparse storage scheme is a matrix storage method of storing the nonzero entries of a sparse matrix in a convenient and efficient way.

There are many different kinds of sparse matrix storage scheme. Their features depend on the properties of the underlying problem. In this discussion, a few simple non-specific schemes will be discussed.

If a direct linear equation solver will be used, storage must also be made available for "fill-in" values that will occur during the factorization process. An iterative solver, on the other hand, has no fill-in values, and never needs to change the entries of the matrix.

A simple storage scheme would require the user to supply NZ, the number of nonzero entries, and three arrays of size NZ: an array ROW of row indices, an array COL of column indices, and an array A holding the corresponding nonzero values.

If this scheme seems excessive, the elements of the matrix could be listed in order of rows, and the ROW array replaced by a ROWEND array of length N, where ROWEND(I) is the index in A of the last nonzero element of row I of the original matrix. This small saving comes at the price of complicating access to the matrix.
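A tiny sketch of the coordinate scheme and the ROWEND variant in Python (numpy); the matrix is an arbitrary example, and the array names follow the description above:

import numpy as np

# The sparse matrix, written out densely only for checking:
#      0 3 0
#      4 0 5
#      0 0 6
# Coordinate storage, listed in order of rows (1-based indices, as in the text).
ROW = np.array([1, 2, 2, 3])
COL = np.array([2, 1, 3, 3])
A   = np.array([3.0, 4.0, 5.0, 6.0])

# Compressed variant: ROWEND(I) is the index in A of the last nonzero of row I,
# so the entries of row 2 occupy locations ROWEND(1)+1 through ROWEND(2).
ROWEND = np.array([1, 3, 4])
print(A[ROWEND[0]:ROWEND[1]])          # [4. 5.]

# Multiply the sparse matrix times a vector using the coordinate storage.
x = np.array([1.0, 2.0, 3.0])
y = np.zeros(3)
for k in range(len(A)):
    y[ROW[k] - 1] += A[k] * x[COL[k] - 1]

dense = np.array([[0.0, 3.0, 0.0],
                  [4.0, 0.0, 5.0],
                  [0.0, 0.0, 6.0]])
print(np.allclose(y, dense @ x))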

SPARSKIT

SPARSKIT is a tool kit for sparse matrix computations.

SPARSKIT can manipulate sparse matrices in a variety of formats, and can convert from one to another. For example, a matrix can be converted from the generalized diagonal format used by ELLPACK and ITPACK to the format used by the Harwell-Boeing Sparse Matrix Collection (HBSMC) or even into LINPACK banded format.

Utilities available include converting data structures, printing simple statistics on a matrix, plotting a matrix profile, performing basic linear algebra operations (similar to the BLAS for dense matrix), and so on.

Spectral Radius

The spectral radius of a matrix is the maximum of the magnitudes of its eigenvalues.

The spectral radius is often easy to compute, and it is a useful measure of the "size" or "strength" of a matrix. However, the spectral radius is not a matrix norm.

The spectral radius violates several rules for matrix norms. In particular, it is possible to have any of the following situations:

On the other hand, the spectral radius is a lower bound for the value of any vector-bound matrix norm on A, because there must be an eigenvalue lambda and a vector of unit norm x with the property that

A * x = lambda * x
so the norm of A*x divided by the norm of x is |lambda|. Therefore, the matrix norm of A must be at least | lambda |.

The Euclidean norm of a real symmetric matrix is equal to its spectral radius.

Spectral condition number

The spectral condition number of a square matrix A is the ratio of the largest and smallest magnitude eigenvalues, or "infinity" if the smallest eigenvalue is zero.

The spectral condition number is a lower bound for the condition number of A under any vector-bound norm.

Spectrum

The spectrum of a square matrix is the set of eigenvalues.

Square Matrix

A square matrix is one which has the same number of rows and columns. Most cases (but not all!) requiring the solution of a linear system involve a square coefficient matrix.

Only a square matrix has a determinant, an inverse (if not singular!), a trace, powers, a matrix exponential, and eigenvalues.

If a matrix is not square, it is called rectangular.

Square Root of a Matrix

Given a matrix A, we say that the matrix B is the square root of A if it is true that A = B * B = B2.

It is obvious that A must be a square matrix in order to have a square root.

(Diagonal matrices): If D is a diagonal matrix, then a square root of D can be computed by taking the square roots of each diagonal entry; if any entry was negative, the square root matrix will be complex.

The square root matrix need not be unique.

For example, the 2x2 diagonal matrix with diagonal entries 4 and 9 has the following four distinct square root matrices:

        (2  0)  (2  0)  (-2  0)  (-2  0)
        (0  3)  (0 -3)  ( 0  3)  ( 0 -3)
      

(Diagonalizable Matrices): If A is diagonalizable, so that A = X * D * inv(X), then a square root of A can be computed by sqrt(A) = X * sqrt(D) * inv(X). Note that if any entry in D was negative, we will end up with a complex square root matrix.

(Symmetric Matrices): If A is positive semidefinite symmetric, then the square root will be real, and can be determined from the Cholesky factorization as follows:

A = L * L'
Now determine the Singular Value Decomposition of L
L = U * D * V'
and write
A = L * L'
= U * D * V' * ( U * D * V' )'
= U * D * V' * V * D * U'
= U * D * D * U'
= U * D * U' * U * D * U'
= X * X
where X is the desired matrix square root of A. In particular:
X = U * D * U'
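A sketch of this construction in Python (numpy), for an arbitrary symmetric positive definite example (so that the Cholesky factor exists):

import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])        # symmetric positive definite

L = np.linalg.cholesky(A)              # A = L * L'
U, d, Vt = np.linalg.svd(L)            # L = U * D * V'
X = U @ np.diag(d) @ U.T               # X = U * D * U'

print(np.allclose(X @ X, A))           # X is a square root of A
print(np.allclose(X, X.T))             # and it is symmetric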

If B3=A, we say that B is the cube root or third root of A, and for any integer N, BN=A means that B is called an N-th root of A.

Stochastic Matrix

A stochastic matrix has only nonnegative entries, with all row sums equal to 1.

A stochastic matrix may also be called a row stochastic matrix or Markov matrix or a transition matrix. These names derive from the fact that the entry Ai,j may be viewed as a probability that a system currently in state I will transition to state J on the next step.

Facts about a (row) stochastic matrix A:

A column stochastic matrix has only nonnegative entries, with the entries in each column summing to 1.

A doubly stochastic matrix is both row and column stochastic.

Facts about a doubly stochastic matrix A:

An ergodic matrix is a row stochastic matrix which has no eigenvalues of modulus 1 except for 1 itself.

Strassen's Algorithm

Strassen's algorithm for matrix multiplication is a method which can produce a matrix product more efficiently than the standard approach.

For simplicity, suppose that the two factors A and B are both of order N. Then the obvious method of producing the product C requires N multiplications and N-1 additions for each entry, for a total of roughly 2 * N^3 floating point operations.

Strassen's algorithm is defined recursively. It is easiest to describe if the matrix has an order that is a power of two. In that case, the product of two matrices of order N is described in terms of the product of matrices of order (N/2), and so on, until factors of order 2 are reached.

Now suppose that A and B are each of order 2. The definitions for the entries of the product C are:

        C11 = A11 * B11 + A12 * B21        C12 = A11 * B12 + A12 * B22
        C21 = A21 * B11 + A22 * B21        C22 = A21 * B12 + A22 * B22
      
Now compute the following quantities:
        P1 = (A11+A22) * (B11+B22)
        P2 = (A21+A22) * B11
        P3 = A11 * (B12-B22)
        P4 = A22 * (B21-B11)
        P5 = (A11+A12) * B22
        P6 = (A21-A11) * (B11+B12)
        P7 = (A12-A22) * (B21+B22)
      
Then it is simply a matter of substitution to show that
        C11 = P1 + P4 - P5 + P7           C12 = P3 + P5
        C21 = P2 + P4                     C22 = P1 + P3 - P2 + P6
      
Instead of 8 multiplications, only 7 are required, at the cost of several more additions.

The reason this saving matters is that we can apply the above formulas recursively. As we break a matrix of order N into matrices of order N/2, we have to form 7 products rather than 8. But each of those 7 products is itself a matrix multiplication of half the order, and hence can be computed by the same algorithm, again using only 7 multiplications rather than 8. The total number of multiplications therefore drops dramatically, and the extra additions needed to form the intermediate quantities do not change this conclusion.

Strassen's algorithm requires an amount of work that increases with N like N^(log2 7), roughly N^2.807, rather than N^3. The extra additions in the defining formulas give the work estimate a larger constant in front, so that for small N the standard algorithm is faster. But there are implementations of the Strassen algorithm that beat the best implementations of the standard algorithm on the Cray.
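A recursive sketch in Python (numpy), restricted as above to matrices whose order is a power of two; a practical implementation would switch to the standard product below some crossover size:

import numpy as np

def strassen(A, B):
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    P1 = strassen(A11 + A22, B11 + B22)
    P2 = strassen(A21 + A22, B11)
    P3 = strassen(A11, B12 - B22)
    P4 = strassen(A22, B21 - B11)
    P5 = strassen(A11 + A12, B22)
    P6 = strassen(A21 - A11, B11 + B12)
    P7 = strassen(A12 - A22, B21 + B22)
    return np.block([[P1 + P4 - P5 + P7, P3 + P5],
                     [P2 + P4, P1 + P3 - P2 + P6]])

A = np.random.rand(8, 8)
B = np.random.rand(8, 8)
print(np.allclose(strassen(A, B), A @ B))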

Reference:

  1. Volker Strassen,
    Gaussian Elimination is not Optimal,
    Numerische Mathematik,
    Volume 13, page 354, 1969.

Submatrix

A submatrix of a matrix A is any rectangular "chunk" of the matrix.

The chosen entries may form a contiguous block, or they may be selected by taking any subset of the row indices and any subset of the column indices of A, in any order.

As a simple example of the use of submatrices, suppose that A has the form

        ( A1 A2 )
        ( 0  A3 )
      
where A1, A2 and A3 are submatrices. Then a linear system involving A could be solved as two smaller subproblems: solve A3 * x2 = b2, then solve A1 * x1 = b1 - A2 * x2. ( Here b1 and b2 are the parts of the right hand side that correspond to the subdivision of A).

Similarly, for this case, the eigenvalues of A can be determined by finding the eigenvalues of A1 and A3. (The computation of the eigenvectors would be a little more complicated.)

Successive Overrelaxation (SOR)

The successive overrelaxation method or SOR is an iterative method for solving linear systems, and is a generalization of the Gauss Seidel and the Jacobi iterations.

SOR is only appropriate for matrices which are strictly diagonally dominant or else symmetric and positive definite.

To derive SOR, think of both the Jacobi and Gauss Seidel iterations as computing a correction to the current estimate of the solution, so that a step of the method has the form:

X[N+1] = X[N] + dX.
SOR offers the ability to add a greater or smaller proportion of the correction, which we will denote w:
X[N+1] = X[N] + w * dX.

Surprisingly, for the appropriate choice of w, the SOR method can converge faster than the Gauss Seidel or Jacobi methods. It can be shown that the SOR method will only be convergent for 0 < w < 2. Values of w less than 1 result in an underrelaxed iteration, while values greater than 1 (the usual case) correspond to an overrelaxed iteration.

For a given coefficient matrix, convergence of the SOR method is optimal for some value of w, but it is generally not easy to determine this value. A variety of schemes are available for estimating and adjusting the value used during a particular iteration.

The SOR iteration can be considered in terms of its matrix splitting. That is, if we decompose the matrix A into its strictly lower triangular, diagonal, and strictly upper triangular parts:

A = L + D + U
then the method is equivalent to the iteration
xnew = x - w * ( D + w * L )-1 * ( A * x - b ).
which means that the convergence of the algorithm can be understood in terms of the behavior of powers of the iteration matrix:
I - w * ( D + w * L )-1 * A.

If the original coefficient matrix A is symmetric, then it may be preferred to use the symmetric SOR iteration or SSOR. In this case, the iteration consists of pairs of SOR steps. The odd steps are the same as the usual iteration. But in the even steps, the variables are solved for in reverse order. Each pair of such steps is a single step of the SSOR iteration, which has the property that its iteration matrix is similar to a symmetric matrix (though not necessarily symmetric itself). Among other things, this means that SSOR can be used as a preconditioner for certain other problems.
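As a concrete illustration of the componentwise update X[N+1] = X[N] + w * dX, here is a sketch in Python (numpy); the matrix, right hand side, value of w, and stopping test are illustrative assumptions:

import numpy as np

def sor(A, b, w=1.2, tol=1.0e-10, max_sweeps=500):
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_sweeps):
        for i in range(n):
            # Gauss Seidel correction for component i ...
            dx = (b[i] - A[i, :] @ x) / A[i, i]
            # ... of which only the fraction w is actually added.
            x[i] += w * dx
        if np.linalg.norm(b - A @ x) < tol:
            break
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])        # symmetric positive definite
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(sor(A, b), np.linalg.solve(A, b)))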

Summable Matrix

A (square) matrix A is said to be summable if the limit as k goes to infinity exists for the following:


        (1/k) * ( I + A + A^2 + ... + A^(k-1) )
      

If A is convergent or semiconvergent, it is summable. However, a matrix can be summable without being convergent.

It should be clear that for a matrix to be summable, the spectral radius rho(A) must be less than or equal to 1. But if the spectral radius is less than one, the matrix is convergent, and we know how that works. The interesting case occurs when rho(A) = 1.

It turns out that if rho(A) = 1, then matrix A will be summable if and only if it has only semisimple eigenvalues.

Note that all stochastic matrices are summable.

Symmetric Matrix

A symmetric matrix A is equal to its transpose, that is,

A = A'
or, for every pair of indices I and J:
Ai,j = Aj,i

Every matrix A can be decomposed into the sum of an antisymmetric and a symmetric matrix:

A = B + C = (1/2) * ( ( A - A' ) + ( A + A' ) )

Here is an example of a symmetric matrix:

        1 2 0 4
        2 9 4 8
        0 4 5 3
        4 8 3 7
      

Simple facts about a symmetric matrix A:

The eigenvalues of successive members of the sequence of leading principal minors of a symmetric matrix have an interlacing relationship: the k+1 eigenvalues of the principal minor of order k+1 are separated by the k eigenvalues of the minor of order k. For an unreduced symmetric tridiagonal matrix the separation is strict; this is the Sturm sequence property.

LAPACK, LINPACK and EISPACK include specialized routines for symmetric matrices, and include the use of symmetric matrix storage to save space.

Symmetric Matrix Storage

Symmetric storage is a matrix storage method of storing a symmetric matrix economically, omitting the repeated elements.

The strict lower triangle of a symmetric or Hermitian matrix is redundant. A symmetric storage scheme packs the upper triangle of the matrix into a linear vector of length ( N * ( N + 1 ) ) / 2. The data is organized by columns, with each column starting in row 1 of the original matrix, and proceeding down to the diagonal.

If A was the matrix:

        11 12 13 14 15
        12 22 23 24 25
        13 23 33 34 35
        14 24 34 44 45
        15 25 35 45 55
      
then A could be symmetrically stored as:
           1   2       3           4               5
        ( 11, 12, 22, 13, 23, 33, 14, 24, 34, 44, 15, 25, 35, 45, 55 ).
      

LAPACK, LINPACK and EISPACK include routines which can operate on data stored in this format.
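In this scheme, the entry Ai,j of the original matrix (with I <= J) ends up in location J*(J-1)/2 + I of the packed vector. A small sketch in Python (numpy); the function names are illustrative:

import numpy as np

def pack_symmetric(A):
    # Pack the upper triangle, by columns, into a linear vector.
    n = A.shape[0]
    return np.array([A[i, j] for j in range(n) for i in range(j + 1)])

def packed_entry(ap, i, j):
    # Return A(i,j) from packed storage (1-based i and j, as in the text).
    if i > j:
        i, j = j, i                  # use symmetry
    return ap[j * (j - 1) // 2 + i - 1]

A = np.array([[11.0, 12.0, 13.0],
              [12.0, 22.0, 23.0],
              [13.0, 23.0, 33.0]])
ap = pack_symmetric(A)
print(ap)                            # [11. 12. 22. 13. 23. 33.]
print(packed_entry(ap, 3, 1))        # 13.0, recovered via symmetry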

TEST_MATRIX

TEST_MATRIX is a collection of subroutines useful for generating and manipulating test matrices.

Many sample matrices are available with known inverse, determinant, eigenvalues, rank, symmetry, and other properties. These matrices may be used to test software for correctness, or for classroom demonstrations.

Most of the matrices come from a MATLAB M file collection developed by Nicholas Higham, Department of Mathematics, University of Manchester, and maintained in the "testmatrix" file somewhere at the MATLAB web site.

An earlier version of the collection is available, again as MATLAB M files, in ACM TOMS Algorithm 694, in the TOMS directory of the NETLIB web site.

I have a FORTRAN version of the source code available in the TEST_MATRIX page.

Toeplitz Matrix

A Toeplitz matrix is a matrix which is constant along each of its diagonals.

Here is an example of a square Toeplitz matrix:

        4 5 6 7
        3 4 5 6
        2 3 4 5
        1 2 3 4
      
a "wide" rectangular Toeplitz matrix:
        3 4 5 6 7
        2 3 4 5 6
        1 2 3 4 5
      
a "tall" rectangular Toeplitz matrix:
        5 6 7
        4 5 6
        3 4 5
        2 3 4
        1 2 3
      

Facts about a Toeplitz matrix A:

Compare the concepts of a Hankel Matrix, a Symmetric Matrix and a Circulant Matrix.

Tournament Matrix

A Tournament matrix is a (square) matrix whose diagonal is all zero, and such that, for every pair of corresponding off-diagonal entries Ai,j and Aj,i, exactly one is -1 and one is 1.

The tournament matrix can be thought of as recording the wins and losses of a set of players, each of whom plays every other player.

Example:

        0  1 -1  1 -1
       -1  0 -1 -1  1
        1  1  0  1  1
       -1  1 -1  0 -1
        1 -1 -1  1  0
      

Facts about a Tournament matrix A:

Tournament Win Matrix

A Tournament win matrix is a (square) matrix whose diagonal is all zero, and such that, for every pair of corresponding off-diagonal entries Ai,j and Aj,i, exactly one is 0 and one is 1.

The tournament win matrix can be thought of as recording the wins of a set of players, each of whom plays every other player.

Example:

        0  1  0  1  0
        0  0  0  0  1
        1  1  0  1  1
        0  1  0  0  0
        1  0  0  1  0
      

Facts about a Tournament Win matrix A:

Trace of a Matrix

The trace of a (square) matrix is the sum of the diagonal elements.

Simple facts about the trace of a matrix A:

Example: The trace of the following matrix is 4:

         1 -1  0
        -1  2 -1
         0 -1  1
      
and it has three eigenvalues, 0, 1 and 3, whose sum is also 4.

Transition Matrix

A transition matrix T is a square matrix, with nonnegative entries, each of whose columns sums to 1.

A transition matrix T can be thought of as a modification of the adjacency matrix A of a graph G. Where the adjacency matrix would have a 1 to indicate a connection, the transition matrix has a nonzero value that indicates the probability of movement to node i from node j.

Given a transition matrix T and a vector x, we can regard x as indicating the the number of sheep (or some other quantity) currently at each node of a graph G. The vector y=T*x then indicates the new distribution of sheep, each of whom has moved to a new node in accordance with the probabilities in the transition matrix.

If the graph G is connected, then the Perron-Frobenius theorem indicates that the matrix T has a single dominant eigenvalue 1, such that, for any nonzero starting vector x, the sequence x, Tx, T^2x, T^3x, ... converges to the eigenvector corresponding to 1.

A transition matrix is an example of a stochastic matrix.

Transpose

The transpose of an mxn matrix A is the nxm matrix A' whose values may be defined by A'i,j = Aj,i.

In printed text, the transpose is usually denoted by a superscript T, as in AT, while in running text the symbols A', A^T, transpose ( A ), or trans ( A ) might be used.

For the square matrix A:

        1 2 0 4
        4 8 9 2
        5 5 6 3
        7 0 0 3
      
the transpose A' is:
        1 4 5 7
        2 8 5 0
        0 9 6 0
        4 2 3 3
      

For the "wide" rectangular matrix A:

        11 12 13 14 15
        21 22 23 24 25
        31 32 33 34 35
      
the transpose A' is the "tall" rectangular matrix:
        11 21 31
        12 22 32
        13 23 33
        14 24 34
        15 25 35
      

Simple facts about the transpose of a matrix A:

The LU factorization of a matrix A allows the solution of linear systems involving A' as well. LAPACK and LINPACK linear solution software takes advantage of this fact.

Trapezoidal Matrix

A trapezoidal matrix is essentially a rectangular triangular matrix. Thus, a trapezoidal matrix has a different number of rows than columns. If, in addition, Ai,j = 0 whenever I > J, the matrix is called upper trapezoidal. If Ai,j = 0 whenever I < J, the matrix is called lower trapezoidal.

Here is a "wide" upper trapezoidal matrix:

        11 12 13 14
         0 22 23 24
         0  0 33 34
      
Here is a "tall" upper trapezoidal matrix:
        11 12 13 14
         0 22 23 24
         0  0 33 34
         0  0  0 44
         0  0  0  0
         0  0  0  0
      

You might encounter an upper trapezoidal matrix when carrying out Gauss Jordan elimination on a rectangular matrix, or computing the QR factorization of a rectangular matrix.

Triangular Matrix

An upper triangular matrix is entirely zero below the main diagonal while a lower triangular matrix is zero above the main diagonal.

If the entries of the main diagonal are all equal to 1, the matrix is said to be unit upper triangular or unit lower triangular. For example, you will encounter a unit lower triangular matrix in Gauss elimination.

Simple facts about a triangular matrix A:

Tridiagonal Matrix

A tridiagonal matrix is a matrix whose only nonzero entries occur on the main diagonal or on the two diagonals which are immediate neighbors of the main diagonal.

The diagonals immediately below and above the main diagonal are referred to as the (principal) subdiagonal and (principal) superdiagonal respectively.

Here is an example of a tridiagonal matrix which is also positive definite symmetric:

         2 -1  0  0
        -1  2 -1  0
         0 -1  2 -1
         0  0 -1  2
      

Simple facts about a tridiagonal matrix A:

If it is true that none of the subdiagonal elements are zero, and none of the superdiagonal elements are zero, and

|A1,1| > |A1,2|
|An,n| > |An,n-1|
and, for 2 <= i <= n-1,
|Ai,i| >= |Ai,i-1| + |Ai,i+1|
then A is nonsingular, and Gauss elimination can be performed without pivoting. This is true, for instance, for the "-1,2,-1" matrix used to approximate the second derivative, and the "2,1;1,4,1;1,2" matrix used in cubic spline interpolation.
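In such a case the system can be solved by a simple forward elimination and back substitution on the three diagonals (often called the Thomas algorithm). A sketch in Python (numpy); the array names are illustrative, and the example is the "-1,2,-1" matrix mentioned above:

import numpy as np

def tridiagonal_solve(sub, diag, sup, b):
    # Gauss elimination without pivoting for a tridiagonal system.
    # sub[i] = A(i+2,i+1), diag[i] = A(i+1,i+1), sup[i] = A(i+1,i+2), 0-based arrays.
    n = len(diag)
    d = diag.astype(float)
    r = b.astype(float)
    x = np.zeros(n)
    for i in range(1, n):
        m = sub[i - 1] / d[i - 1]
        d[i] -= m * sup[i - 1]
        r[i] -= m * r[i - 1]
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    return x

# The "-1,2,-1" matrix of order 4.
sub = np.array([-1.0, -1.0, -1.0])
diag = np.array([2.0, 2.0, 2.0, 2.0])
sup = np.array([-1.0, -1.0, -1.0])
b = np.array([1.0, 0.0, 0.0, 1.0])
A = np.diag(diag) + np.diag(sub, -1) + np.diag(sup, 1)
print(np.allclose(tridiagonal_solve(sub, diag, sup, b), np.linalg.solve(A, b)))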

If a real tridiagonal matrix A is irreducible and symmetric (or, more generally, strictly sign symmetric), then its eigenvalues are real and distinct. Moreover, the eigenvalues of the sequence of principal minors of A have a strict interlacing property: the k+1 eigenvalues of the principal minor of order k+1 are strictly separated by the k eigenvalues of the minor of order k. Since the determinant of the minors can be easily computed recursively, and since the sign sequence of these determinants carries information about the number of negative eigenvalues associated with each minor, this suggests how a bisection method can be employed to hunt for the eigenvalues of a tridiagonal matrix.

LAPACK and LINPACK include special routines for efficiently solving linear systems with a tridiagonal coefficient matrix. LAPACK and EISPACK have routines for finding eigenvalues of a tridiagonal matrix.

Cyclic reduction is a method of solving a tridiagonal system of equations which can give a large speedup on vector or parallel processors.

Underdetermined System

An underdetermined linear system is a linear system A * x = b in which the values of N variables are sought, but fewer than N linearly independent equations are given.

Of course, it is possible to set up a linear system where the number of equations M equals or exceeds the number of variables N, but for which the number of linearly independent equations is less than N. Thus, you can't assume that M >= N rules out the underdetermined case; it is the number of linearly independent equations that matters. But if you start with fewer than N equations, you are guaranteed an underdetermined system.

An underdetermined linear system typically has an infinite number of solutions which constitute an affine linear space. The dimension of the linear space is usually called the number of degrees of freedom.
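
A small numpy/scipy sketch (not part of the original glossary; the system is an arbitrary illustration) of 2 equations in 4 unknowns, with a 2-dimensional space of solutions:

        import numpy as np
        from scipy.linalg import null_space

        # 2 equations in 4 unknowns
        A = np.array([[1.0, 2.0, 0.0, 1.0],
                      [0.0, 1.0, 1.0, 3.0]])
        b = np.array([3.0, 2.0])

        x0, *_ = np.linalg.lstsq(A, b, rcond=None)   # the minimum norm solution
        N = null_space(A)                            # basis of the 2-dimensional null space

        # x0 plus any combination of null space vectors is also a solution,
        # so there are 2 degrees of freedom
        c = np.array([1.5, -0.7])
        x = x0 + N @ c
        print(np.allclose(A @ x0, b), np.allclose(A @ x, b))   # True True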

Unimodular Matrix

A unimodular matrix is a square matrix whose determinant has absolute value 1.

Facts about a unimodular matrix A:

Examples of unimodular matrices:
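
As a small check in Python (a sketch, with an arbitrarily chosen example matrix): an integer matrix with determinant +1 or -1 is unimodular, and its inverse is again an integer matrix, since the inverse is the adjoint divided by the determinant:

        import numpy as np

        A = np.array([[2, 3],
                      [1, 2]])         # an integer matrix with determinant 1

        print(round(np.linalg.det(A)))   # 1, so A is unimodular

        # the inverse is again an integer matrix, since
        # inverse(A) = adjoint(A) / det(A) and det(A) is +1 or -1
        print(np.linalg.inv(A))          # [[ 2. -3.]
                                         #  [-1.  2.]]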

Unitary Matrix

A unitary matrix is a complex matrix U whose conjugate transpose UH is equal to its inverse:

U-1 = UH

Facts about a unitary matrix U

In real arithmetic, the corresponding concept is an orthogonal matrix.
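
A quick numerical illustration in Python (a sketch, not part of the original glossary): the Q factor of the QR factorization of a random complex matrix is unitary, so UH * U = I and multiplication by U preserves the Euclidean norm of a vector:

        import numpy as np

        rng = np.random.default_rng(0)
        Z = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
        U, _ = np.linalg.qr(Z)     # the Q factor of a complex matrix is unitary

        print(np.allclose(U.conj().T @ U, np.eye(4)))    # U^H * U = I

        x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
        print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))   # ||U*x|| = ||x||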

Unitary Similarity Transformation

A unitary similarity transformation is a relationship between two complex matrices A and B, and a unitary matrix V, of the form:

A = V-1 * B * V.

Since V is unitary, V-1 = VH, and so the relationship can equally be written as A = VH * B * V. Because A and B are similar, they have the same eigenvalues.
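
A quick numerical check of that last fact (a numpy sketch with randomly chosen matrices, purely illustrative):

        import numpy as np

        rng = np.random.default_rng(1)
        B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

        # build a unitary V as the Q factor of a random complex matrix
        V, _ = np.linalg.qr(rng.standard_normal((4, 4))
                            + 1j * rng.standard_normal((4, 4)))

        A = V.conj().T @ B @ V     # V^-1 = V^H, so this is V^-1 * B * V

        # similar matrices have the same eigenvalues
        print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                          np.sort_complex(np.linalg.eigvals(B))))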

Unreduced Matrix

A reduced matrix is a (usually square) matrix A which may be broken up into submatrices

        A11 | A12
        ----+---
        0   | A22
      
where A11 is square, and the submatrix below it is completely zero.

An unreduced matrix is of course a matrix which does not have this form.

An irreducible matrix is one which cannot be put into reduced form by a renumbering of the variables. (The same permutation must be applied to both rows and columns.)
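
One way to test irreducibility (a scipy sketch, not part of the original glossary) uses the graph characterization: A is irreducible exactly when the directed graph with an edge from node I to node J for every nonzero Ai,j is strongly connected. The function name is a choice of this sketch:

        import numpy as np
        from scipy.sparse import csr_matrix
        from scipy.sparse.csgraph import connected_components

        def is_irreducible(A):
            # A is irreducible exactly when the directed graph with an edge
            # I -> J for every nonzero Ai,j is strongly connected
            graph = csr_matrix((A != 0).astype(int))
            n_components, _ = connected_components(graph, directed=True,
                                                   connection='strong')
            return n_components == 1

        # a reduced matrix: the block below the leading 1x1 block is zero
        R = np.array([[1.0, 2.0, 3.0],
                      [0.0, 4.0, 5.0],
                      [0.0, 6.0, 7.0]])

        # an unreduced (hence irreducible) tridiagonal matrix
        T = np.array([[ 2.0, -1.0,  0.0],
                      [-1.0,  2.0, -1.0],
                      [ 0.0, -1.0,  2.0]])

        print(is_irreducible(R), is_irreducible(T))   # False True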

An unreduced tridiagonal matrix is one which has no "extra" zeroes, that is, the immediate superdiagonal and subdiagonal have no zero entries.

An unreduced upper Hessenberg matrix is one which has no "extra" zeroes, that is, the immediate subdiagonal has no zero entries.

Facts about a (square) reduced matrix A:

Upshift Matrix

The upshift matrix A circularly shifts all vector entries or matrix rows up 1 position.

Example:

        0 1 0 0
        0 0 1 0
        0 0 0 1
        1 0 0 0
      

Facts about the upshift matrix A:
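
A small numpy illustration (a sketch, not part of the original glossary) of the upshift matrix of order 4 shown above; note that applying the shift n times returns every entry to its place, so An = I:

        import numpy as np

        A = np.roll(np.eye(4, dtype=int), -1, axis=0)   # the 4x4 upshift matrix

        v = np.array([10, 20, 30, 40])
        print(A @ v)           # [20 30 40 10]: entries shifted up circularly

        # applying the shift 4 times restores the original order, so A^4 = I
        print(np.array_equal(np.linalg.matrix_power(A, 4), np.eye(4, dtype=int)))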

Vector-Bound Matrix Norm

A vector-bound matrix norm is a matrix norm that has been (or can be) derived from a vector norm by the following formula:

||A|| = supremum ||A*x|| / ||x||
where the supremum (roughly, the "maximum") is taken over all nonzero vectors x.

If such a relationship holds, then expressions involving the matrix norm and vector norm can be mingled to produce useful inequalities, based on the guaranteed compatibility relationship:

||A*x|| <= ||A|| * ||x||

Matrix norms which are vector bound with some vector norm include the L1 matrix norm, the L2 matrix norm, and the L Infinity matrix norm.
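
For instance (a numpy sketch, not from the original text), the vector-bound L Infinity matrix norm equals the maximum absolute row sum, and sampling the ratio ||A*x|| / ||x|| over many random vectors approaches that value from below while never exceeding it:

        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.standard_normal((4, 4))

        # sample the ratio that defines the vector-bound L Infinity norm
        ratios = []
        for _ in range(10000):
            x = rng.standard_normal(4)
            ratios.append(np.linalg.norm(A @ x, np.inf) / np.linalg.norm(x, np.inf))

        print(max(ratios))                  # approaches ||A||oo from below
        print(np.linalg.norm(A, np.inf))    # the maximum absolute row sum

        # compatibility: ||A*x|| <= ||A|| * ||x|| held for every sample
        print(max(ratios) <= np.linalg.norm(A, np.inf))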

Vector Norm

A vector norm is a function ||*|| that measures the "size" of a vector.

A vector norm must have the following properties:

Commonly used vector norms include:

A simple relationship holds between these norms:

||v||oo <= ||v||2 <= ||v||1

Given two points in space, x and y, we can define the distance between the points, d(x,y), in terms of a vector norm operating on the vectors of position coordinates:

d(x,y) = || x - y ||
and this quantity will have the expected properties of a distance function.
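
A small numpy illustration (a sketch, not part of the original glossary) of the common norms, the inequality between them, and the induced distance:

        import numpy as np

        v = np.array([3.0, -4.0, 1.0, 0.5])

        norm1 = np.linalg.norm(v, 1)         # sum of absolute values = 8.5
        norm2 = np.linalg.norm(v, 2)         # Euclidean norm, about 5.12
        normoo = np.linalg.norm(v, np.inf)   # largest absolute value = 4.0

        print(normoo <= norm2 <= norm1)      # the chain ||v||oo <= ||v||2 <= ||v||1

        # the distance between two points is the norm of their difference
        x = np.array([1.0, 2.0, 3.0])
        y = np.array([4.0, 6.0, 3.0])
        print(np.linalg.norm(x - y))         # 5.0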

For a given vector norm, it is important to know which matrix norms are compatible, so that expressions like

||A*x|| <= ||A|| * ||x||
may be asserted.

Z Matrix

A Z matrix is a real (square) matrix with nonpositive off-diagonal entries.

If the diagonal entries of a Z matrix are strictly positive, then it is also an L matrix.

A Z matrix is a "candidate" to be an M matrix, and over 50 different conditions are known that will guarantee that a Z matrix is, in fact, an M matrix.

Zero Matrix

The zero matrix is a matrix all of whose entries are zero.

A zero matrix is sometimes called a trivial matrix. A matrix which has at least one nonzero entry is called a nontrivial matrix or nonzero matrix. (It is not required that all entries be nonzero, just that at least one of them is nonzero!)

Zero One Matrix

A zero one matrix is a matrix whose entries are equal to 0 or 1.

A few examples of zero one matrices include:

Many combinatorial problems can be formulated in terms of a zero one matrix.

The permanent of a zero one matrix is equal to the number of generalized diagonals that contain no 0 element. The permanent of an n by n zero one matrix is zero if and only if it contains an r by s subblock of zeroes, with r+s >= n+1.
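
Here is a brute force permanent computation in Python (a sketch, practical only for small matrices since it sums over all n! generalized diagonals; the example matrices are arbitrary). The first matrix contains a 2 by 2 zero block with r+s = n+1, so its permanent is zero; the second has two nonzero generalized diagonals, so its permanent is 2:

        import numpy as np
        from itertools import permutations

        def permanent(A):
            # sum over all generalized diagonals: one entry from each row
            # and each column; for a zero one matrix this counts the
            # generalized diagonals that contain no 0 element
            n = len(A)
            return sum(np.prod([A[i, p[i]] for i in range(n)])
                       for p in permutations(range(n)))

        A = np.array([[1, 0, 0],      # rows 1-2 and columns 2-3 form a
                      [1, 0, 0],      # 2 by 2 zero block: r+s = 4 >= n+1
                      [1, 1, 1]])
        B = np.array([[1, 1, 0],
                      [0, 1, 1],
                      [1, 0, 1]])

        print(permanent(A), permanent(B))     # 0 2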


Last revised on 19 February 2020.