Linear Maps - I

An important technique for studying structured sets is to study functions between such sets that preserve their structure. In the current context, the structure inherent to vector spaces is linearity. Maps between vector spaces that preserve this linear structure, called linear maps, are studied now. Throughout this section, $V$ and $W$ represent finite dimensional inner product spaces of dimension $n$ and $m$, respectively.

Basic definitions

Let us first consider the case when $V$ and $W$ are real vector spaces, not necessarily endowed with an inner product. We call a map of the form $T : V \to W$ a vector space homomorphism, or more simply a linear map, if, for any $u, v \in V$, and for any $\alpha, \beta \in \mathbb{R}$,

$$T(\alpha u + \beta v) = \alpha T(u) + \beta T(v).$$

It is conventional to write $T(v)$ as just $Tv$ when $T$ is a linear map. Notice how linear maps preserve the linear structure: the action of a linear map on a linear combination of vectors is the linear combination of the action of the linear map on the individual vectors.

Remark

Note that, in the definition above, the vector addition and scalar multiplication in the term $\alpha u + \beta v$ are those defined in $V$, while the ones in the term $\alpha T(u) + \beta T(v)$ are those in $W$. The same symbols are used for both only to keep the notation simple.

Example

Consider the map $P : \mathbb{R}^3 \to \mathbb{R}^2$ defined as follows: for any $x = (x_1, x_2, x_3) \in \mathbb{R}^3$,

$$Px = (x_1, x_2).$$

It is easy to verify that $P$ is a linear map.

We will now present a few definitions associated with linear maps that are useful in practice. The kernel of the linear map $T : V \to W$ is the set of all elements in $V$ that are mapped to $0 \in W$ by $T$:

$$\ker T = \{ v \in V : Tv = 0 \}.$$

$\ker T$ is also called the null space of $T$, and is a linear subspace of $V$. Note that the kernel is always non-empty since $T0 = 0$.

The image of the linear map $T$ between two vector spaces $V$ and $W$, written as $\operatorname{im} T$, is defined as the set of all those elements in $W$ which are obtained by the action of $T$ on some element of $V$:

$$\operatorname{im} T = \{ w \in W : w = Tv \ \text{for some } v \in V \}.$$

The image of a linear map is a linear subspace of its codomain. The dimension of the image of the linear map $T$ is called the rank of $T$.

The following result holds for any linear map $T : V \to W$ between finite dimensional vector spaces:

$$\dim \ker T + \dim \operatorname{im} T = \dim V.$$

This is known as the rank-nullity theorem.

Example

Consider the map $P : \mathbb{R}^3 \to \mathbb{R}^2$ that was defined earlier as $Px = (x_1, x_2)$ for any $x = (x_1, x_2, x_3) \in \mathbb{R}^3$. In this case the null space of $P$ is seen to be

$$\ker P = \{ (0, 0, x_3) : x_3 \in \mathbb{R} \}.$$

Note that $\dim \ker P = 1$. The range of $P$ is easily seen to be the whole of $\mathbb{R}^2$. We therefore obtain the relation $\dim \operatorname{im} P = 2$. From these results, we see that

$$\dim \ker P + \dim \operatorname{im} P = 1 + 2 = 3 = \dim \mathbb{R}^3,$$

thereby verifying the rank-nullity theorem.
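For readers who like to experiment, the following sketch verifies this count numerically with NumPy; it is purely illustrative and uses the matrix representation of $P$ introduced formally later in this section.

```python
import numpy as np

# Matrix of the projection P(x1, x2, x3) = (x1, x2) in the standard bases
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

rank = np.linalg.matrix_rank(P)              # dim im P
s = np.linalg.svd(P, compute_uv=False)       # singular values of P
nullity = P.shape[1] - np.sum(s > 1e-12)     # dim ker P

print(rank, nullity, rank + nullity)         # 2 1 3  (= dim R^3)
```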

If a linear map $T : V \to W$ is a bijection, then $T$ is called a vector space isomorphism, or just an isomorphism. In this case $V$ and $W$ are said to be isomorphic - written as $V \cong W$.

Example

The identity map $I : V \to V$, defined as $Iv = v$ for any $v \in V$, is easily seen to be an isomorphism on $V$.

Example

Consider the set of all real-valued polynomials of one real variable of degree less than or equal to two:

$$\mathcal{P}_2 = \{ p : \mathbb{R} \to \mathbb{R} \ \mid \ p(x) = a_0 + a_1 x + a_2 x^2, \ a_0, a_1, a_2 \in \mathbb{R} \}.$$

It is easy to check that $\mathcal{P}_2$ is a real vector space with addition and scalar multiplication defined as follows: given $p, q \in \mathcal{P}_2$, $\alpha \in \mathbb{R}$, and $x \in \mathbb{R}$, $p + q$ and $\alpha p$ are defined as

$$(p + q)(x) = p(x) + q(x), \qquad (\alpha p)(x) = \alpha\, p(x).$$

Consider now the following map $T : \mathcal{P}_2 \to \mathbb{R}^3$ defined as follows: for any $p \in \mathcal{P}_2$ such that $p(x) = a_0 + a_1 x + a_2 x^2$ for any $x \in \mathbb{R}$,

$$Tp = (a_0, a_1, a_2).$$

It is easy to check that $T$ is indeed an isomorphism.
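A minimal computational sketch of this isomorphism, assuming NumPy's `Polynomial` class as a stand-in for elements of $\mathcal{P}_2$ (the helper names `T` and `T_inv` are illustrative):

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

def T(p):      # the coefficient map P_2 -> R^3, p = a0 + a1 x + a2 x^2
    a = np.zeros(3)
    a[:len(p.coef)] = p.coef
    return a

def T_inv(a):  # its inverse R^3 -> P_2
    return Poly(a)

p, q = Poly([1.0, -2.0, 0.5]), Poly([0.0, 3.0])
assert np.allclose(T(2.0 * p + q), 2.0 * T(p) + T(q))  # linearity
assert T_inv(T(p)) == p                                # T is invertible
```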

Finally, if $T : V \to W$ is a vector space isomorphism, then $T^{-1} : W \to V$ is also a linear map. To show this, note that for any $w_1, w_2 \in W$, and $\alpha, \beta \in \mathbb{R}$,

$$T^{-1}(\alpha w_1 + \beta w_2) = T^{-1}\big(\alpha Tv_1 + \beta Tv_2\big) = T^{-1}\big(T(\alpha v_1 + \beta v_2)\big) = \alpha v_1 + \beta v_2 = \alpha T^{-1} w_1 + \beta T^{-1} w_2.$$

Here, $v_1, v_2$ are the unique elements of $V$ such that $w_1 = Tv_1$ and $w_2 = Tv_2$. Note that the existence and uniqueness of these vectors follows from the fact that $T$ is invertible.

Representation of linear maps

Let us now specialize the discussion to the case when $V$ and $W$ are finite dimensional inner product spaces of dimension $n$ and $m$, respectively. For notational simplicity, we will denote the inner products in both $V$ and $W$ with the same symbol $\cdot$, with the meaning assumed to be evident from the context. Let $T$ be a linear map from $V$ into $W$, and let $\{e_i\}_{i=1}^n$ and $\{f_j\}_{j=1}^m$ be orthonormal bases of $V$ and $W$, respectively. Then, for any $v = \sum_{i=1}^n v_i e_i \in V$, we see from the linearity of $T$ that

$$Tv = \sum_{i=1}^n v_i\, Te_i.$$

Since $Te_i \in W$, we can express it in terms of the basis $\{f_j\}$ of $W$ as

$$Te_i = \sum_{j=1}^m T_{ji}\, f_j$$

for some constants $T_{ji}$ for every $i = 1, \dots, n$ and $j = 1, \dots, m$. Note the order in which the indices are placed! We can now exploit the availability of an inner product in both $V$ and $W$ to see that

$$T_{ji} = f_j \cdot Te_i.$$

We now collect together all the constants $T_{ji}$ as an $m \times n$ matrix $[T]$ whose $(j, i)$ entry is $T_{ji}$. The elements $T_{ji}$ are called the components of $T$ with respect to the chosen bases. The matrix $[T]$ is called the matrix representation of $T$ with respect to the bases $\{e_i\}$ of $V$ and $\{f_j\}$ of $W$.

Remark

Occasionally, the notation $[T]_{\{e_i\}, \{f_j\}}$ will be used to denote the matrix corresponding to a given linear map $T$ with respect to bases $\{e_i\}$ of $V$ and $\{f_j\}$ of $W$. When the choice of bases is evident from the context, the simpler notation $[T]$ will also be used.

Example

Let us revisit the linear map $P : \mathbb{R}^3 \to \mathbb{R}^2$. With respect to the standard bases of $\mathbb{R}^3$ and $\mathbb{R}^2$, $P$ has the following matrix representation:

$$[P] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},$$

as can be checked easily with a simple calculation.

To see the advantage of representing a linear map as a matrix, let $w = Tv$ be the effect of the action of $T$ on $v$. We can write this equation in component form, with respect to the orthonormal bases $\{e_i\}$ and $\{f_j\}$ of $V$ and $W$, respectively, as

$$w_j = \sum_{i=1}^n T_{ji}\, v_i, \qquad j = 1, \dots, m.$$

In matrix notation, the component form of the equation reads

$$\begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix} = \begin{pmatrix} T_{11} & \cdots & T_{1n} \\ \vdots & \ddots & \vdots \\ T_{m1} & \cdots & T_{mn} \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}.$$

We thus see that $[w] = [T][v]$. The special choice of the placement of indices for $T_{ji}$ is made so as to ensure a neat matrix equation of the form $[w] = [T][v]$ for the components of the various quantities with respect to chosen bases.
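The following sketch assembles $[T]$ from the components $T_{ji} = f_j \cdot Te_i$ and checks that $[w] = [T][v]$; the particular map used here is chosen arbitrarily for illustration.

```python
import numpy as np

def T(v):  # an arbitrary linear map T : R^3 -> R^2, chosen for illustration
    return np.array([v[0] + 2.0 * v[2], v[1] - v[0]])

e = np.eye(3)  # standard orthonormal basis of V = R^3
f = np.eye(2)  # standard orthonormal basis of W = R^2

# Components T_ji = f_j . T(e_i), collected into a 2 x 3 matrix
Tmat = np.array([[f[j] @ T(e[i]) for i in range(3)] for j in range(2)])

v = np.array([1.0, -2.0, 0.5])
assert np.allclose(T(v), Tmat @ v)  # component form: w_j = sum_i T_ji v_i
```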

Remark

The foregoing discussion can be summarized through the following commutative diagram:

[Figure: Commutative diagram illustrating the basis representation of linear maps.]

Notice how we have interpreted the matrix $[T]$ as a map of the form $[T] : \mathbb{R}^n \to \mathbb{R}^m$ that acts on a column vector of size $n$ to produce a column vector of size $m$. This map is called the representation of $T$ with respect to the chosen bases.

Let us now consider the case when $V$ and $W$ are equipped with general bases. Given two general bases $\{e_i\}_{i=1}^n$ and $\{f_j\}_{j=1}^m$ of $V$ and $W$, respectively, the linear map $T$ can be represented as follows: for any $v \in V$, note that

$$Tv = \sum_{i=1}^n v_i\, Te_i.$$

Since $Te_i \in W$, it can be expressed in terms of the basis $\{f_j\}$ of $W$ as

$$Te_i = \sum_{j=1}^m T_{ji}\, f_j.$$

Defining the reciprocal basis $\{f^j\}_{j=1}^m$ of $\{f_j\}_{j=1}^m$ through the relations $f^j \cdot f_k = \delta_{jk}$, where $\delta_{jk}$ is the Kronecker delta, it follows that

$$T_{ji} = f^j \cdot Te_i.$$

Notice the similarity and difference with the corresponding expression for the components of $T$ with respect to orthonormal bases in $V$ and $W$.

Example

Let $V$ and $W$ represent real-valued polynomials of order $2$ and $1$, respectively, defined on the closed interval $[-1, 1]$. Given any $p, q \in V$, define their inner product as

$$p \cdot q = \int_{-1}^{1} p(x)\, q(x)\, dx.$$

The inner product on $W$ is similarly defined. Consider the bases $\{e_i\}_{i=1}^3$ and $\{f_j\}_{j=1}^2$ of $V$ and $W$, respectively, defined as follows: for any $x \in [-1, 1]$,

$$e_1(x) = \frac{1}{\sqrt{2}}, \quad e_2(x) = \sqrt{\frac{3}{2}}\, x, \quad e_3(x) = \sqrt{\frac{5}{8}}\, (3x^2 - 1), \qquad f_1(x) = \frac{1}{\sqrt{2}}, \quad f_2(x) = \sqrt{\frac{3}{2}}\, x.$$

It is not hard to check that $\{e_i\}$ and $\{f_j\}$ are orthonormal bases of $V$ and $W$, respectively.

Remark

The basis $\{e_i\}$ is obtained by applying the Gram-Schmidt orthogonalization procedure to the basis $\{g_i\}$ defined as follows: for any $x \in [-1, 1]$, $g_1(x) = 1$, $g_2(x) = x$, and $g_3(x) = x^2$. The reader is invited to verify this.
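The reader can also let the machine do the verification. The sketch below runs Gram-Schmidt on $\{1, x, x^2\}$ with the $L^2$ inner product on $[-1, 1]$, assuming NumPy's polynomial class; the helper `ip` is illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

def ip(p, q):  # L^2 inner product on [-1, 1], via exact polynomial integration
    r = (p * q).integ()
    return r(1.0) - r(-1.0)

g = [Poly([1.0]), Poly([0.0, 1.0]), Poly([0.0, 0.0, 1.0])]  # 1, x, x^2
e = []
for gi in g:
    for ei in e:
        gi = gi - ip(ei, gi) * ei       # subtract projections onto earlier e's
    e.append(gi / np.sqrt(ip(gi, gi)))  # normalize

for ei in e:
    print(ei.coef)  # ~[0.707], [0, 1.225], [-0.791, 0, 2.372]
```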

Let us now consider the map $D : V \to W$ defined as follows: for any $p \in V$,

$$(Dp)(x) = \frac{dp}{dx}(x), \qquad x \in [-1, 1].$$

It is left as an easy exercise to verify that $D$ is a linear map.

We can work out the representation of $D$ with respect to the orthonormal bases defined earlier as follows: $D_{ji} = f_j \cdot De_i$, for $i = 1, 2, 3$ and $j = 1, 2$. Doing the computation, we see that

$$[D] = \begin{pmatrix} 0 & \sqrt{3} & 0 \\ 0 & 0 & \sqrt{15} \end{pmatrix}.$$

As an illustration, the computations of $D_{12}$ and $D_{23}$ are shown below:

$$D_{12} = f_1 \cdot De_2 = \int_{-1}^{1} \frac{1}{\sqrt{2}}\, \sqrt{\frac{3}{2}}\, dx = \sqrt{3},$$

$$D_{23} = f_2 \cdot De_3 = \int_{-1}^{1} \sqrt{\frac{3}{2}}\, x \cdot 6\sqrt{\frac{5}{8}}\, x\, dx = \sqrt{15}.$$

The other components are computed similarly.

To check that this representation is valid, let us verify that if $p \in V$ and $q = Dp$, then $[q] = [D][p]$. Suppose, without loss of generality, that for any $x \in [-1, 1]$, $p(x) = a_0 + a_1 x + a_2 x^2$, for some $a_0, a_1, a_2 \in \mathbb{R}$. We can compute the representation of $p$ with respect to the orthonormal basis $\{e_i\}$ of $V$ as follows: $p_i = e_i \cdot p$. A straightforward calculation shows that

$$[p] = \begin{pmatrix} \sqrt{2}\left(a_0 + \dfrac{a_2}{3}\right) \\[4pt] \sqrt{\dfrac{2}{3}}\, a_1 \\[4pt] \dfrac{2}{3}\sqrt{\dfrac{2}{5}}\, a_2 \end{pmatrix}.$$

It is likewise easy to compute the representation of $q = Dp$, where $q(x) = a_1 + 2a_2 x$, with respect to the basis $\{f_j\}$ of $W$ as

$$[q] = \begin{pmatrix} \sqrt{2}\, a_1 \\[2pt] 2\sqrt{\dfrac{2}{3}}\, a_2 \end{pmatrix}.$$

We therefore verify by simple matrix multiplication that $[q] = [D][p]$:

$$\begin{pmatrix} 0 & \sqrt{3} & 0 \\ 0 & 0 & \sqrt{15} \end{pmatrix} \begin{pmatrix} \sqrt{2}\left(a_0 + \frac{a_2}{3}\right) \\ \sqrt{\frac{2}{3}}\, a_1 \\ \frac{2}{3}\sqrt{\frac{2}{5}}\, a_2 \end{pmatrix} = \begin{pmatrix} \sqrt{2}\, a_1 \\ 2\sqrt{\frac{2}{3}}\, a_2 \end{pmatrix}.$$
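The entire computation can also be cross-checked numerically; a sketch assuming NumPy's polynomial class:

```python
import numpy as np
from numpy.polynomial import Polynomial as Poly

def ip(p, q):  # L^2 inner product on [-1, 1]
    r = (p * q).integ()
    return r(1.0) - r(-1.0)

# Normalized Legendre bases of V (degree <= 2) and W (degree <= 1)
e = [Poly([2**-0.5]),
     Poly([0.0, (3 / 2)**0.5]),
     Poly([-(5 / 8)**0.5, 0.0, 3 * (5 / 8)**0.5])]
f = e[:2]

# Components D_ji = f_j . D e_i of the derivative map
D = np.array([[ip(f[j], e[i].deriv()) for i in range(3)] for j in range(2)])

p = Poly([1.0, 2.0, 3.0])                           # a0 + a1 x + a2 x^2
p_comp = np.array([ip(ei, p) for ei in e])          # [p]
q_comp = np.array([ip(fj, p.deriv()) for fj in f])  # [q], q = Dp
assert np.allclose(q_comp, D @ p_comp)              # [q] = [D][p]
```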

Remark

When learning linear algebra for the first time, it is strongly recommended to work out all the details of this example step by step since it covers many of the concepts introduced earlier.

Given three vector spaces $U$, $V$ and $W$, let us consider the successive action of two linear maps $T : U \to V$ and $S : V \to W$ on a vector $u \in U$. We define the product map $ST : U \to W$ as

$$(ST)u = S(Tu).$$

It can be shown that this translates to $[ST] = [S][T]$ in matrix notation, with respect to any choice of bases in $U$, $V$ and $W$. To see this, let $\{d_k\}$, $\{e_i\}$ and $\{f_j\}$ be general bases of $U$, $V$ and $W$, respectively. Then, for any $k$, we see that

$$(ST)\, d_k = S(T d_k) = S\Big( \sum_i T_{ik}\, e_i \Big) = \sum_i T_{ik}\, Se_i = \sum_j \Big( \sum_i S_{ji}\, T_{ik} \Big) f_j,$$

so that $(ST)_{jk} = \sum_i S_{ji} T_{ik}$. The sum on the right hand side of this equation thus corresponds to the familiar matrix multiplication. Combining the expression just derived with the definition $(ST)u = S(Tu)$, we see that $[ST] = [S][T]$. In fact, the rationale for defining matrix multiplication in the specific way it is defined is to ensure that the matrix representation of the product map is the product of the matrix representations of the individual maps.
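A quick numerical illustration of $[ST] = [S][T]$, with randomly chosen matrices standing in for the representations (purely a sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
Tm = rng.standard_normal((4, 3))  # [T] for some T : U -> V, dim U = 3, dim V = 4
Sm = rng.standard_normal((2, 4))  # [S] for some S : V -> W, dim W = 2

u = rng.standard_normal(3)
assert np.allclose(Sm @ (Tm @ u), (Sm @ Tm) @ u)  # (S T)u = S(Tu)
```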

Remark

Some authors write $S \cdot T$ for the product map which we denote as $ST$. We will reserve the symbol $\cdot$ almost exclusively for the inner product, and hence will write the product of $S$ and $T$ as $ST$. It's a good idea to be conscious of the specific notational choices whenever you consult other references.

Change of basis for linear maps

The representation of a linear map $T$ between finite dimensional inner product spaces $V$ and $W$, of dimension $n$ and $m$, respectively, as a map $[T] : \mathbb{R}^n \to \mathbb{R}^m$ depends on the choice of bases for both $V$ and $W$. Specifically, recall that if $\{e_i\}$ and $\{f_j\}$ are general bases of $V$ and $W$, respectively, then the matrix representation of $T$ is computed using the relation

$$T_{ji} = f^j \cdot Te_i.$$

Suppose now that $\{\tilde{e}_i\}$ and $\{\tilde{f}_j\}$ are another choice of general bases for $V$ and $W$, respectively. Then the components of the matrix representation of $T$, written $\tilde{T}_{ji}$, are computed as

$$\tilde{T}_{ji} = \tilde{f}^j \cdot T\tilde{e}_i.$$

The equations relating $T_{ji}$ and $\tilde{T}_{ji}$ are now worked out. Though these calculations take a much simpler form with respect to a choice of orthonormal bases for both $V$ and $W$, the slightly more involved case involving general bases is presented below as a good algebraic exercise. Let the new bases $\{\tilde{e}_i\}$ of $V$ and $\{\tilde{f}_j\}$ of $W$ depend on the old bases $\{e_i\}$ of $V$ and $\{f_j\}$ of $W$ as

$$\tilde{e}_i = \sum_{k=1}^n A_{ki}\, e_k, \qquad \tilde{f}_j = \sum_{l=1}^m B_{lj}\, f_l,$$

where $A_{ki} = e^k \cdot \tilde{e}_i$ and $B_{lj} = f^l \cdot \tilde{f}_j$. It follows then from an easy computation that the corresponding reciprocal bases are related as

$$\tilde{e}^i = \sum_{k=1}^n (A^{-1})_{ik}\, e^k, \qquad \tilde{f}^j = \sum_{l=1}^m (B^{-1})_{jl}\, f^l,$$

where $(A^{-1})_{ik}$ and $(B^{-1})_{jl}$ are the components of the inverses of the matrices $[A]$ and $[B]$. Noting that $\tilde{T}_{ji} = \tilde{f}^j \cdot T\tilde{e}_i$ and that $Te_k = \sum_l T_{lk}\, f_l$, it follows at once that

$$\tilde{T}_{ji} = \sum_{l=1}^m \sum_{k=1}^n (B^{-1})_{jl}\, T_{lk}\, A_{ki}.$$

Using matrix notation, we can write the foregoing equation in matrix form as

$$[\tilde{T}] = [B]^{-1}\, [T]\, [A],$$

where $[A]$ and $[B]$ are the matrices whose $(k, i)$ and $(l, j)$ components are $A_{ki}$ and $B_{lj}$, respectively.
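The change of basis rule can be checked numerically as follows. The sketch keeps the old bases as the standard orthonormal ones, so that the columns of $[A]$ and $[B]$ hold the new basis vectors and the reciprocal vectors $\tilde{f}^j$ are the rows of $[B]^{-1}$; this construction is illustrative and not part of the original development.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((2, 3))  # [T] w.r.t. the old (standard) bases
A = rng.standard_normal((3, 3))  # columns: new basis of V in old coordinates
B = rng.standard_normal((2, 2))  # columns: new basis of W in old coordinates

Binv = np.linalg.inv(B)
T_new = Binv @ T @ A             # [T~] = inv([B]) [T] [A]

# Direct check of T~_ji = f~^j . T e~_i, with f~^j the rows of inv([B])
direct = np.array([[Binv[j] @ (T @ A[:, i]) for i in range(3)]
                   for j in range(2)])
assert np.allclose(T_new, direct)
```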

Remark

The foregoing discussion is summarized in the following commutative diagram:

[Figure: Commutative diagram illustrating the change of basis rules for linear maps.]

Notice how the commutative diagram neatly summarizes the change in the representation of a linear map upon change of bases in the domain and codomain vector spaces.

In the special case when all the bases are orthonormal, the transformation matrices $[A]$ and $[B]$ are orthogonal. In this case, we can write the transformation rule as follows:

$$[\tilde{T}] = [B]^T\, [T]\, [A].$$

This special case will turn out to be very useful in later applications.

Example

Let us revisit the projection map $P : \mathbb{R}^3 \to \mathbb{R}^2$. We saw earlier that with respect to the standard bases $\{e_i\}$ of $\mathbb{R}^3$ and $\{f_j\}$ of $\mathbb{R}^2$, the linear map $P$ has the following representation:

$$[P] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$

Let us now consider a different set of bases $\{\tilde{e}_i\}$ of $\mathbb{R}^3$ and $\{\tilde{f}_j\}$ of $\mathbb{R}^2$, where $\{\tilde{f}_j\}$ is obtained by rotating $\{f_j\}$ by an angle $\theta$, and $\{\tilde{e}_i\}$ is obtained by rotating $\{e_i\}$ by the same angle $\theta$ about $e_3$. Thus, we have the following relations:

$$\tilde{e}_1 = \cos\theta\, e_1 + \sin\theta\, e_2, \quad \tilde{e}_2 = -\sin\theta\, e_1 + \cos\theta\, e_2, \quad \tilde{e}_3 = e_3,$$

$$\tilde{f}_1 = \cos\theta\, f_1 + \sin\theta\, f_2, \quad \tilde{f}_2 = -\sin\theta\, f_1 + \cos\theta\, f_2.$$

The components of $P$ with respect to the bases $\{\tilde{e}_i\}$ of $\mathbb{R}^3$ and $\{\tilde{f}_j\}$ of $\mathbb{R}^2$ are computed using

$$\tilde{P}_{ji} = \tilde{f}_j \cdot P\tilde{e}_i.$$

A simple calculation shows that

$$[\tilde{P}] = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$

The change of basis can also be computed using the relation $[\tilde{P}] = [B]^T [P] [A]$, where $[A]$ and $[B]$ are the matrices that relate the new and old bases in $\mathbb{R}^3$ and $\mathbb{R}^2$, respectively. It is left as a simple exercise to check that this yields the same representation as shown above.
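This calculation is easy to confirm numerically; a sketch for one value of $\theta$:

```python
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

theta = 0.7
c, s = np.cos(theta), np.sin(theta)
A = np.array([[c, -s, 0.0],   # columns: e~_1, e~_2, e~_3 in old coordinates
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])
B = np.array([[c, -s],        # columns: f~_1, f~_2 in old coordinates
              [s,  c]])

P_new = B.T @ P @ A           # orthonormal bases: inv([B]) = [B]^T
assert np.allclose(P_new, P)  # representation unchanged, for any theta
```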

Notice that in this special case, the representation of $P$ does not change for any value of $\theta$. This is not true for a general linear map. Can you think of a simple geometric interpretation of this invariance of the representation of $P$?

Tensor product basis for $L(V,W)$

Suppose that $V$ and $W$ are inner product spaces of dimension $n$ and $m$, respectively. Let us focus on the set $L(V, W)$ of all linear maps from $V$ into $W$. We will now show that $L(V, W)$ is also a vector space, and study a particularly useful basis, called the tensor product basis, for $L(V, W)$.

Defining addition and scalar multiplication as

$$(S + T)v = Sv + Tv, \qquad (\alpha T)v = \alpha\, (Tv),$$

for any $S, T \in L(V, W)$, $v \in V$ and $\alpha \in \mathbb{R}$, it is easy to check that the set $L(V, W)$ has the structure of a real vector space - the two operations introduced above satisfy all the axioms of a real vector space listed earlier.

Remark

It is possible to define a norm on $L(V, W)$ as

$$\|T\| = \sup_{\|v\| = 1} \|Tv\|,$$

for any $T \in L(V, W)$, where the supremum is taken over all unit vectors $v \in V$. This is also called the sup norm. It is easy to show that $\|Tv\| \le \|T\|\, \|v\|$ for any $T \in L(V, W)$ and $v \in V$. An inner product for $L(V, W)$ will be introduced later using the trace of a linear map.
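For the Euclidean norms on $\mathbb{R}^n$ and $\mathbb{R}^m$, the sup norm of $T$ coincides with the largest singular value of $[T]$; a numerical sketch of the inequality $\|Tv\| \le \|T\|\, \|v\|$:

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((2, 3))

op_norm = np.linalg.norm(T, 2)  # sup norm = largest singular value of [T]
v = rng.standard_normal(3)
assert np.linalg.norm(T @ v) <= op_norm * np.linalg.norm(v) + 1e-12
```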

What is the dimension of $L(V, W)$? To answer this question, it is helpful to introduce the notion of a tensor product map. Given vectors $w \in W$ and $v \in V$, the tensor product map $w \otimes v : V \to W$ is defined as follows: for any $u \in V$,

$$(w \otimes v)\, u = (v \cdot u)\, w.$$

It is easily checked that this is in fact a linear map. It is convenient to consider first the special case when $V$ and $W$ are equipped with orthonormal bases. Let $\{e_i\}_{i=1}^n$ and $\{f_j\}_{j=1}^m$ be orthonormal bases of $V$ and $W$, respectively. Let us study the linear maps $f_j \otimes e_i$ for every $i = 1, \dots, n$ and $j = 1, \dots, m$. Note that for any $u = \sum_i u_i e_i \in V$,

$$(f_j \otimes e_i)\, u = (e_i \cdot u)\, f_j = u_i\, f_j.$$

It is easily checked that the maps $f_j \otimes e_i$ are linearly independent. Indeed, if for real numbers $\alpha_{ji}$, where $i = 1, \dots, n$ and $j = 1, \dots, m$, it is the case that

$$\sum_{j=1}^m \sum_{i=1}^n \alpha_{ji}\, f_j \otimes e_i = 0,$$

then, for every $k = 1, \dots, n$,

$$0 = \Big( \sum_{j=1}^m \sum_{i=1}^n \alpha_{ji}\, f_j \otimes e_i \Big)\, e_k = \sum_{j=1}^m \alpha_{jk}\, f_j,$$

whence $\alpha_{jk} = 0$ for every $j = 1, \dots, m$. This shows that the linear maps $f_j \otimes e_i$ are linearly independent. Notice how the orthonormality of the basis of $V$ and the linear independence of the basis of $W$ are used in proving this.

Suppose now that $T \in L(V, W)$ is any linear map. Then for any $u \in V$, we have

$$Tu = \sum_{j=1}^m (f_j \cdot Tu)\, f_j = \sum_{j=1}^m \sum_{i=1}^n T_{ji}\, u_i\, f_j = \Big( \sum_{j=1}^m \sum_{i=1}^n T_{ji}\, f_j \otimes e_i \Big)\, u.$$

Since this is true for any $u \in V$, we get the following identity:

$$T = \sum_{j=1}^m \sum_{i=1}^n T_{ji}\, f_j \otimes e_i.$$

This informs us that the maps $f_j \otimes e_i$ span $L(V, W)$.

The preceding two facts show that the maps $f_j \otimes e_i$ indeed form a basis of $L(V, W)$, called the tensor product basis of $L(V, W)$. Since there are $mn$ such maps, we see that the dimension of $L(V, W)$ is $mn$:

$$\dim L(V, W) = \dim V \cdot \dim W = nm.$$

Thus, $L(V, W)$ is a finite dimensional vector space with dimension equal to the product of the dimensions of $V$ and $W$.
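In components, the expansion of $T$ in the tensor product basis amounts to writing $[T]$ as a weighted sum of outer product matrices; a short numerical sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 2, 3
T = rng.standard_normal((m, n))  # components T_ji w.r.t. orthonormal bases

f, e = np.eye(m), np.eye(n)
# [f_j (x) e_i] is the outer product matrix with a single unit entry
recon = sum(T[j, i] * np.outer(f[j], e[i]) for j in range(m) for i in range(n))
assert np.allclose(recon, T)     # T = sum_ji T_ji f_j (x) e_i
```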

Remark

The change of basis rule derived earlier for the components of a linear map can also be derived using the tensor product representation. It is a simple exercise to verify this.

Example

Consider the identity map $I : V \to V$ defined as follows: for any $v \in V$, $Iv = v$. It is trivial to check that this is a linear map. The components of $I$ with respect to an orthonormal basis $\{e_i\}_{i=1}^n$ of $V$ are obtained as follows:

$$I_{ij} = e_i \cdot Ie_j = e_i \cdot e_j = \delta_{ij}.$$

We thus see that

$$I = \sum_{i=1}^n e_i \otimes e_i.$$

This representation of the identity map is very useful in applications.

Example

Considering again the projection map $P : \mathbb{R}^3 \to \mathbb{R}^2$ introduced earlier, note that, with respect to the standard bases $\{e_i\}$ of $\mathbb{R}^3$ and $\{f_j\}$ of $\mathbb{R}^2$, we have

$$P = f_1 \otimes e_1 + f_2 \otimes e_2.$$

Compare the form of this representation with the representation of the identity map in the previous example. Notice how we can read off the fact that $P$ projects the first two components of the vector it acts on based on this analogy.

Let us briefly look at the representation of any $T \in L(V, W)$ with respect to general bases $\{e_i\}$ of $V$ and $\{f_j\}$ of $W$. Given any $u \in V$,

$$Tu = \sum_{j=1}^m (f^j \cdot Tu)\, f_j = \sum_{j=1}^m \sum_{i=1}^n (f^j \cdot Te_i)(e^i \cdot u)\, f_j = \Big( \sum_{j=1}^m \sum_{i=1}^n T_{ji}\, f_j \otimes e^i \Big)\, u.$$

Since this is true for any $u \in V$, it is evident that

$$T = \sum_{j=1}^m \sum_{i=1}^n T_{ji}\, f_j \otimes e^i, \qquad T_{ji} = f^j \cdot Te_i.$$

Notice how the reciprocal basis shows up when using general bases.

Remark

Note that the linear map $T$ can be represented in a number of equivalent ways with respect to general bases $\{e_i\}$ of $V$ and $\{f_j\}$ of $W$ as follows:

$$T = \sum_{j,i} (f^j \cdot Te_i)\, f_j \otimes e^i = \sum_{j,i} (f^j \cdot Te^i)\, f_j \otimes e_i = \sum_{j,i} (f_j \cdot Te_i)\, f^j \otimes e^i = \sum_{j,i} (f_j \cdot Te^i)\, f^j \otimes e_i.$$

The default representation will be chosen in these notes as $T = \sum_{j,i} T_{ji}\, f_j \otimes e^i$, with $T_{ji} = f^j \cdot Te_i$, but this is merely a matter of convention.

Transpose of a linear map

Given a linear map $T \in L(V, W)$ between finite dimensional inner product spaces, we will now construct an important linear map, $T^T \in L(W, V)$, called the transpose of $T$, as follows: for any $v \in V$ and $w \in W$,

$$Tv \cdot w = v \cdot T^T w.$$

To get a handle on this definition and relate it to the more elementary notion of the transpose of a matrix, let us consider the representation of $T^T$ with respect to orthonormal bases of $V$ and $W$. Given an orthonormal basis $\{e_i\}_{i=1}^n$ of $V$ and $\{f_j\}_{j=1}^m$ of $W$, we can easily compute the representation of $T^T$ with respect to these bases as follows: if $T = \sum_{j,i} T_{ji}\, f_j \otimes e_i$ and $T^T = \sum_{i,j} (T^T)_{ij}\, e_i \otimes f_j$, then

$$(T^T)_{ij} = e_i \cdot T^T f_j = Te_i \cdot f_j = T_{ji}.$$

Here $i = 1, \dots, n$ and $j = 1, \dots, m$. In matrix notation, this amounts to the following equation:

$$[T^T] = [T]^T.$$

We thus recover the familiar expression for the transpose of a matrix. The fact that $(T^T)_{ij} = T_{ji}$ also leads to the following representation of $T^T$ with respect to the bases $\{e_i\}$ and $\{f_j\}$ of $V$ and $W$, respectively:

$$T^T = \sum_{i=1}^n \sum_{j=1}^m T_{ji}\, e_i \otimes f_j.$$

In the special case of the linear map $w \otimes v$, where $w \in W$ and $v \in V$, we see from this equation that

$$(w \otimes v)^T = v \otimes w.$$

Note that the equation $(w \otimes v)^T = v \otimes w$ is valid in general, even though orthonormal bases were used to prove the fact. The reason is that the bases do not appear in the final form of this equation. Alternatively, the fact that $(w \otimes v)^T = v \otimes w$ can be directly checked using the definition of the transpose.

Let us now compute the representation of the transpose of a linear map $T \in L(V, W)$ with respect to general bases $\{e_i\}$ for $V$ and $\{f_j\}$ for $W$. With respect to this choice of bases, we have, for any $v \in V$ and $w \in W$:

$$v \cdot T^T w = Tv \cdot w = \Big( \sum_{j,i} T_{ji}\, f_j \otimes e^i \Big)\, v \cdot w = \sum_{j,i} T_{ji}\, (e^i \cdot v)(f_j \cdot w) = v \cdot \Big( \sum_{i,j} T_{ji}\, e^i \otimes f_j \Big)\, w.$$

Since this equation holds true for any choice of $v \in V$ and $w \in W$, it follows immediately that

$$T^T = \sum_{i=1}^n \sum_{j=1}^m T_{ji}\, e^i \otimes f_j.$$

In the special case when the bases $\{e_i\}$ of $V$ and $\{f_j\}$ of $W$ are orthonormal, the use of the relations $e^i = e_i$ and $f^j = f_j$ yields the familiar expression for the components of the transpose: $(T^T)_{ij} = T_{ji}$.

Example

The properties of the transpose of a linear map parallel those of the transpose of matrices. For instance, if $S, T \in L(V, W)$ and $\alpha, \beta \in \mathbb{R}$, then

$$(\alpha S + \beta T)^T = \alpha S^T + \beta T^T.$$

Given linear maps $T \in L(U, V)$ and $S \in L(V, W)$, we have the following relation:

$$(ST)^T = T^T S^T.$$

These properties are easy consequences of the definition of the transpose - it is left as an exercise to verify these claims.
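Both identities are easy to confirm numerically for matrix representations; a sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(4)
S1, S2 = rng.standard_normal((2, 4)), rng.standard_normal((2, 4))
T = rng.standard_normal((4, 3))
a, b = 0.3, -1.7

assert np.allclose((a * S1 + b * S2).T, a * S1.T + b * S2.T)  # linearity
assert np.allclose((S1 @ T).T, T.T @ S1.T)                    # (S T)^T = T^T S^T
```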