Curvilinear Coordinates
Note
Much of what is written here is overkill if the eventual goal is just to learn how to compute the expressions for divergence, curl, etc. in special orthogonal coordinate systems. The presentation below is, however, meant to encourage the reader to pursue a more detailed study of Riemannian geometry, where many of the ideas developed here find a more elegant and natural generalization.
The notion of a general coordinate system was introduced in the previous chapter. All the coordinate computations thus far have been restricted to the global Cartesian coordinate system on $\mathbb{R}^3$. There are, however, many applications where the use of special coordinate systems significantly simplifies the calculations. It is thus of interest to study how the calculus of tensor fields on $\mathbb{R}^3$ can be developed within a general coordinate setting.
Before discussing the general case of tensor fields on $\mathbb{R}^3$, it is instructive to consider the case of a smooth vector field $\mathsf{v}:\mathbb{R}^3 \to T\mathbb{R}^3$. Given any $\mathsf{x} \in \mathbb{R}^3$, the vector $\mathsf{v}(\mathsf{x}) \in T_{\mathsf{x}}\mathbb{R}^3$ admits the following representation with respect to the Cartesian coordinate system on $\mathbb{R}^3$: $$\mathsf{v}(\mathsf{x}) = \sum_{i=1}^3 v_i(\mathsf{x})\,\mathsf{e}_i.$$ The question that naturally arises is how to represent $\mathsf{v}(\mathsf{x})$ using a general coordinate system on $\mathbb{R}^3$. In particular, if each tangent space $T_{\mathsf{x}}\mathbb{R}^3$ is equipped with a different basis, then the vector field can be locally expressed with respect to the corresponding basis. What will become evident soon is that given a general coordinate system on $\mathbb{R}^3$, it is possible to construct a natural set of basis vectors for each tangent space, called the coordinate basis, which in turn can be used to build a systematic calculus using these curvilinear coordinates.
Remark
It is possible to choose basis vectors for each tangent space that are not derived from coordinates. Such a choice is called a frame, and the associated calculus is called Cartan's method of moving frames. We will not develop that here since coordinate bases are sufficient for our purposes.
Reciprocal basis vs Dual basis
Before we proceed further, it is helpful to clarify the difference between a reciprocal basis and a dual basis, since the distinction between these two concepts is often not clear in the literature. To keep the discussion slightly more general, let us work in the context of a finite dimensional vector space $V$ of dimension $n$.
Let $B = (\mathsf{b}_i)_{i=1}^n$ be a basis of $V$. Any vector $\mathsf{v} \in V$ admits a basis representation of the following form: $$\mathsf{v} = \sum_{i=1}^n v^i \mathsf{b}_i.$$ Note that the components of $\mathsf{v}$ with respect to the basis $B$ are represented using superscripts. The dual space of $V$, written $V^\star$, is defined as the set of all linear functions defined on $V$: $$V^\star = \{\omega:V \to \mathbb{R} \,:\, \omega \text{ is linear}\}.$$ Elements of $V^\star$ are called dual vectors or covectors. Note that since $V$ is finite dimensional, elements of the dual space $V^\star$ are also continuous; this is not true in general if $V$ is infinite dimensional. Using the linearity of $\omega \in V^\star$, we see at once that for any $\mathsf{v} \in V$, we have $$\omega(\mathsf{v}) = \sum_{i=1}^n v^i\,\omega(\mathsf{b}_i).$$ Introducing the special linear maps $\beta^i \in V^\star$, $i = 1, \ldots, n$, as $$\beta^i(\mathsf{v}) = v^i$$ for any $\mathsf{v} = \sum v^i \mathsf{b}_i \in V$, we see that $$\omega(\mathsf{v}) = \sum_{i=1}^n \omega(\mathsf{b}_i)\,\beta^i(\mathsf{v}).$$ Since this is true for every $\mathsf{v} \in V$, we have the following identity: $$\omega = \sum_{i=1}^n \omega_i\,\beta^i,$$ where $\omega_i = \omega(\mathsf{b}_i) \in \mathbb{R}$. It is straightforward to establish that $B^\star = (\beta^i)_{i=1}^n$ is a basis of the dual space $V^\star$; it is called the dual basis of $V^\star$ corresponding to the basis $B$ of $V$. This also shows that the dual space $V^\star$ is a finite dimensional vector space with the same dimension as $V$.
An important property of the dual basis $B^\star$ which follows immediately from its definition is that $$\beta^i(\mathsf{b}_j) = \delta^i_j.$$ This is, in fact, the property that defines a dual basis.
Given an arbitrary $\omega \in V^\star$, we can express it in terms of the dual basis $B^\star$ as $$\omega = \sum_{i=1}^n \omega_i\,\beta^i.$$ Notice the use of subscripts to denote the components of $\omega$. Given any $\mathsf{v} \in V$, we see at once that $$\omega(\mathsf{v}) = \sum_{i=1}^n \omega_i v^i.$$ The subscript and superscript notation is chosen so that natural pairings like $\omega(\mathsf{v})$, which is also called a duality pairing, have a neat component representation involving sums over matching subscripts and superscripts.
Let us now recall the notion of a reciprocal basis introduced earlier. Consider the special case when $V$ has an inner product $\cdot:V \times V \to \mathbb{R}$. In this case, we can define the reciprocal basis $\hat{B} = (\mathsf{b}^i)_{i=1}^n$ as the set of vectors in $V$ satisfying $$\mathsf{b}^i \cdot \mathsf{b}_j = \delta^i_j.$$ Note that the existence of the reciprocal basis is guaranteed by the defining properties of an inner product. Given an arbitrary $\mathsf{v} \in V$, we can express it both in terms of the basis $B$ and its reciprocal basis $\hat{B}$ as $$\mathsf{v} = \sum_{i=1}^n v^i \mathsf{b}_i = \sum_{i=1}^n \hat{v}_i \mathsf{b}^i.$$ Note that the components of a vector with respect to a reciprocal basis are also represented using subscripts.
To understand the relation between the dual and reciprocal bases, let us consider the basis $B$ of $V$, the reciprocal basis $\hat{B}$ of $V$, and the dual basis $B^\star$ of $V^\star$. Given $\omega \in V^\star$, the Riesz representation theorem tells us that we can find a $\mathsf{w} \in V$ such that, for any $\mathsf{v} \in V$, $$\omega(\mathsf{v}) = \mathsf{w}\cdot\mathsf{v}.$$ In fact, it is possible in this case to find this relationship explicitly. Let us first make the following definitions: $$b_{ij} = \mathsf{b}_i\cdot\mathsf{b}_j, \qquad b^{ij} = \mathsf{b}^i\cdot\mathsf{b}^j.$$ It is straightforward to verify that $$\mathsf{b}_i = \sum_j b_{ij}\,\mathsf{b}^j, \qquad \mathsf{b}^i = \sum_j b^{ij}\,\mathsf{b}_j,$$ and that the matrices whose components are $(b_{ij})$ and $(b^{ij})$ are inverses of each other. We then see that the equation $\omega(\mathsf{v}) = \mathsf{w}\cdot\mathsf{v}$ is equivalent to $$\sum_i \omega_i v^i = \sum_{i,j} b_{ij} w^j v^i.$$ Since this is true for all $\mathsf{v} \in V$, we see at once that $$\omega_i = \sum_j b_{ij} w^j = \hat{w}_i.$$ We can also invert this relation to see that $$w^i = \sum_j b^{ij}\omega_j.$$ We thus see that, as a consequence of the inner product, there is a one-to-one correspondence between the vector space $V$ and its dual space $V^\star$, which arises by identifying each vector $\mathsf{w} \in V$ with the covector $\omega \in V^\star$ such that $\omega(\mathsf{v}) = \mathsf{w}\cdot\mathsf{v}$ for every $\mathsf{v} \in V$. In particular, the identity $\beta^i(\mathsf{v}) = v^i = \mathsf{b}^i\cdot\mathsf{v}$ suggests an identification of the dual basis covector $\beta^i$ with the reciprocal basis vector $\mathsf{b}^i$. This is the reason why the components of covectors and the components of the corresponding vectors with respect to the reciprocal basis are both represented with subscripts, and why these two distinct notions are often identified in the literature. Since we will always work with inner product spaces in this course, it is sufficient to work just with the reciprocal basis. But the distinction between the dual basis and the reciprocal basis should be kept in mind: the former is always defined, while the latter is defined only when an inner product is available.
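As a concrete check of the relations discussed above, the following sketch builds the reciprocal basis of a non-orthogonal basis of $\mathbb{R}^3$ from the inverse of the Gram matrix $b_{ij} = \mathsf{b}_i\cdot\mathsf{b}_j$, and verifies that $\mathsf{b}^i\cdot\mathsf{b}_j = \delta^i_j$ and that a covector with components $\omega_i = \sum_j b_{ij}w^j$ reproduces $\mathsf{w}\cdot\mathsf{v}$. The particular basis and the vectors `w_up`, `v_up` are arbitrary illustrative choices, not taken from the text.

```python
# Illustrative sketch in pure Python; the basis and test vectors are hypothetical.

def dot(u, v):
    """Standard inner product on R^3."""
    return sum(a * b for a, b in zip(u, v))

def inverse3(m):
    """Inverse of a 3x3 matrix via the transposed cofactor matrix."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]

def combine(coeffs, vectors):
    """Linear combination sum_j coeffs[j] * vectors[j]."""
    return tuple(sum(c * vec[k] for c, vec in zip(coeffs, vectors)) for k in range(3))

# A non-orthogonal basis b_1, b_2, b_3 (assumed for illustration).
basis = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]

gram = [[dot(bi, bj) for bj in basis] for bi in basis]   # b_ij
gram_inv = inverse3(gram)                                # b^ij

# Reciprocal basis vectors b^i = sum_j b^ij b_j.
reciprocal = [combine(gram_inv[i], basis) for i in range(3)]

# pairing[i][j] = b^i . b_j should be the Kronecker delta.
pairing = [[dot(reciprocal[i], basis[j]) for j in range(3)] for i in range(3)]

# Riesz correspondence: omega_i = sum_j b_ij w^j gives omega(v) = w . v.
w_up = (1.0, -2.0, 0.5)
v_up = (0.3, 0.4, -1.0)
omega = [sum(gram[i][j] * w_up[j] for j in range(3)) for i in range(3)]
lhs = sum(omega[i] * v_up[i] for i in range(3))           # omega(v)
rhs = dot(combine(w_up, basis), combine(v_up, basis))     # w . v
```

The Gram-matrix route is used here only because it mirrors the definitions; for a basis of $\mathbb{R}^3$ one could equally use cross products.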
For the special case of the standard basis $(\mathsf{e}_i)_{i=1}^3$ of $\mathbb{R}^3$, note that the identity $\mathsf{e}_i\cdot\mathsf{e}_j = \delta_{ij}$ informs us that $$\mathsf{e}^i = \mathsf{e}_i.$$ The dual space of $\mathbb{R}^3$, namely $(\mathbb{R}^3)^\star$, can be identified with $\mathbb{R}^3$ using the standard inner product. The dual basis in $(\mathbb{R}^3)^\star$, $(\epsilon^i)_{i=1}^3$, is defined such that $$\epsilon^i(\mathsf{e}_j) = \delta^i_j.$$ For any $\mathsf{z} \in \mathbb{R}^3$, a simple calculation shows that $$\epsilon^i(\mathsf{z}) = \mathsf{e}^i\cdot\mathsf{z} = \mathsf{e}_i\cdot\mathsf{z}.$$ The placement of the indices is thus irrelevant when dealing with Cartesian coordinate systems. This is the reason for working with subscripts throughout in the foregoing development. We will however try to follow the convention, in this chapter, that whenever an index is repeated twice and summed over, one occurrence is a superscript and the other is a subscript, even when dealing with Cartesian coordinate systems.
Notation
Since we will not employ the dual basis in the sequel, we will drop the hat when representing components of vectors (and tensors) with respect to the reciprocal basis. Thus, we will write $\mathsf{v} = \sum v_i \mathsf{b}^i = \sum v^i \mathsf{b}_i$.
Coordinate basis
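Recall that a coordinate system on $\mathbb{R}^3$ is a pair $(U,\phi)$ where $U$ is an open subset of $\mathbb{R}^3$ and $\phi$ is a diffeomorphism from $U$ onto $\phi(U) \subseteq \mathbb{R}^3$. The curvilinear coordinates of any $\mathsf{x} \in U$ are then defined as $\mathsf{y} = \mathsf{y}(\mathsf{x})$. Recall that this is a simplified notation for $\mathsf{y} = \phi(\mathsf{x})$. The inverse of this relation is written as $\mathsf{x} = \mathsf{x}(\mathsf{y})$. It is convenient to introduce the notation $\hat{U}$ to denote the image of $U$ under $\phi$.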
Remark
Note that depending on the choice of coordinate system, the triads of real numbers $(x^1,x^2,x^3)$ and $(y^1,y^2,y^3)$ could represent the same point in $\mathbb{R}^3$. Specifically, if $\mathsf{y} = \mathsf{y}(\mathsf{x})$, or if $\mathsf{x} = \mathsf{x}(\mathsf{y})$, then they indeed represent the same point. It is conventional to work with the coordinates $(x^1,x^2,x^3)$ when working in a Cartesian coordinate setting, and the coordinates $(y^1,y^2,y^3)$, for the same point, when using curvilinear coordinates.
A coordinate curve at is a map of the form , where for some , such that, for any , where . Note that for . The tangent vector to the coordinate curve at is defined as the coordinate tangent vector: Note that each . Further, the triad of tangent vectors is linearly independent: to see this, note that if for some real numbers , , then The implication in the equation above follows from the fact that the triad of vectors is linearly independent. Arranging these three equations in matrix form, it follows from the fact that that . This shows that the triad is a basis of , and is called the coordinate basis of with respect to the coordinate system . At any , it is easy to compute the scalars : Note that it is not the case, in general, that . This shows that the basis of is not necessarily an orthonormal basis.
Remark
The scalars $g_{ij}(\mathsf{y})$ are the components of a tensor field called the metric tensor, to be introduced shortly. The presence of a metric tensor gives what is called a Riemannian structure. It is possible to develop a more general theory without specifying a Riemannian structure, but we will not require that level of generality in this course.
Given a smooth vector field , the vector can now be expressed in terms of the corresponding coordinate basis , where , as follows: Notice how both the component fields and the basis vectors depend on . This is in contrast to the Cartesian representation of where the basis vectors do not depend on the tangent space under consideration. The change of basis rules discussed earlier can be used here to study how the curvilinear components of are related to its Cartesian components : if , where , then The inverse relation can be calculated by inverting this equation. This is more elegantly achieved by the use of the reciprocal basis to , which is presented next.
Reciprocal coordinate basis
The coordinate basis to the tangent space at is in general not orthonormal, as was noted in the previous section. It is therefore pertinent to compute its reciprocal basis to simplify various computations. A useful relation that helps in establishing many of the results in this section stems from the observation that the coordinate relations and are inverses of each other, whence Differentiating this expression with respect to immediately yields the identity Similarly, the identity yields the relation These two identities state the expected fact that the matrices whose entries are and are inverses of each other. These identities can be used to derive a variety of useful results. First, note that Second, note that This calculation shows at once that the vectors defined as satisfy the relations It is thus evident that the vectors thus defined constitute the reciprocal basis to the coordinate basis of , which is called the reciprocal coordinate basis of .
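The reciprocal coordinate basis $(\mathsf{g}^i(\mathsf{y}))_{i=1}^3$ of $T_{\mathsf{x}}U$ can be used to compute the scalars $g^{ij}(\mathsf{y})$: $$g^{ij}(\mathsf{y}) = \mathsf{g}^i(\mathsf{y})\cdot\mathsf{g}^j(\mathsf{y}).$$ It is straightforward to verify that $$\sum_k g_{ik}(\mathsf{y})\,g^{kj}(\mathsf{y}) = \delta_i^j.$$ In other words, the matrices whose $(i,j)$ entries are $g_{ij}(\mathsf{y})$ and $g^{ij}(\mathsf{y})$ are inverses of each other.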
We also note for future reference the following symmetry relations: $g_{ij}(\mathsf{y}) = g_{ji}(\mathsf{y})$ and $g^{ij}(\mathsf{y}) = g^{ji}(\mathsf{y})$. These follow at once from the symmetry of the Euclidean inner product.
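The following sketch illustrates these statements numerically. It approximates the coordinate tangent vectors $\partial\mathsf{x}/\partial y^i$ by central differences, forms $g_{ij}$ as their pairwise inner products, inverts the resulting matrix to get $g^{ij}$, and checks symmetry and the inverse relation. The map `x_of_y` (a polar-type map) and the sample point are assumptions made purely for illustration.

```python
import math

def x_of_y(y):
    """Hypothetical map from curvilinear coordinates (y1, y2, y3) to Cartesian ones."""
    r, theta, z = y
    return (r * math.cos(theta), r * math.sin(theta), z)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coordinate_basis(y, h=1e-6):
    """Approximate the tangent vectors g_i = dx/dy^i by central differences."""
    basis = []
    for i in range(3):
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        xp, xm = x_of_y(yp), x_of_y(ym)
        basis.append(tuple((a - b) / (2 * h) for a, b in zip(xp, xm)))
    return basis

def inverse3(m):
    """Inverse of a 3x3 matrix via the transposed cofactor matrix."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]

y = (2.0, 0.7, -1.0)                      # sample point (assumed)
g_vec = coordinate_basis(y)
g_lower = [[dot(gi, gj) for gj in g_vec] for gi in g_vec]   # g_ij
g_upper = inverse3(g_lower)                                  # g^ij

# product[i][j] = sum_k g_ik g^kj should be the identity matrix.
product = [[sum(g_lower[i][k] * g_upper[k][j] for k in range(3))
            for j in range(3)] for i in range(3)]
```

For this particular map the basis is orthogonal but not orthonormal: `g_lower[1][1]` is close to $(y^1)^2 = 4$ rather than $1$.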
As an example of the usefulness of the reciprocal basis, the change in coordinate representation of a vector field is revisited. Suppose that is a smooth vector field on . The coordinate representation of using the Cartesian coordinate system and the coordinate system are related as follows: , where and . Using the reciprocal coordinate basis just introduced, it follows after a simple computation that This is the inverse of the relation obtained in the previous section.
Christoffel symbols
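The discussion thus far can be summarized as follows: the choice of a coordinate system $(U,\phi)$ naturally suggests a choice of basis $(\mathsf{g}_i(\mathsf{y}))_{i=1}^3$, where $\mathsf{y} = \mathsf{y}(\mathsf{x})$, for $T_{\mathsf{x}}U$ called the coordinate basis. A natural question to ask at this juncture is how the coordinate basis vectors change as $\mathsf{x}$ varies over $U$. To answer this question, note that $$\frac{\partial \mathsf{g}_i}{\partial y^j}(\mathsf{y}) = \sum_l \frac{\partial^2 x^l}{\partial y^i \partial y^j}(\mathsf{y})\,\mathsf{e}_l = \sum_{k,l} \frac{\partial y^k}{\partial x^l}(\mathsf{x})\,\frac{\partial^2 x^l}{\partial y^i \partial y^j}(\mathsf{y})\,\mathsf{g}_k(\mathsf{y}),$$ where $\mathsf{x} = \mathsf{x}(\mathsf{y})$. We introduce the Christoffel symbols as $$\Gamma^k_{ij}(\mathsf{y}) = \sum_l \frac{\partial y^k}{\partial x^l}(\mathsf{x})\,\frac{\partial^2 x^l}{\partial y^i \partial y^j}(\mathsf{y}),$$ where $\mathsf{x} = \mathsf{x}(\mathsf{y})$. Note that each $\Gamma^k_{ij}$ is a scalar field on $\hat{U}$, but they do not form the components of a third order tensor field. Using the Christoffel symbols, the foregoing equation can be written compactly as $$\frac{\partial \mathsf{g}_i}{\partial y^j}(\mathsf{y}) = \sum_k \Gamma^k_{ij}(\mathsf{y})\,\mathsf{g}_k(\mathsf{y}).$$ The Christoffel symbols thus provide a means to study how the coordinate basis vectors change when moving from one tangent space to a neighboring one. It also follows from this result that $$\Gamma^k_{ij}(\mathsf{y}) = \mathsf{g}^k(\mathsf{y})\cdot\frac{\partial \mathsf{g}_i}{\partial y^j}(\mathsf{y}).$$ This equation clearly demonstrates that in the special case of the Cartesian coordinate system, the Christoffel symbols vanish identically since the basis vectors do not change when moving from one tangent space to another.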
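An entirely analogous set of calculations can be performed to show that $$\frac{\partial \mathsf{g}^i}{\partial y^j}(\mathsf{y}) = -\sum_k \Gamma^i_{jk}(\mathsf{y})\,\mathsf{g}^k(\mathsf{y}).$$ Thus the Christoffel symbols provide all the information necessary to compute how the basis vectors change as we move across tangent spaces.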
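As an illustration, the sketch below estimates the Christoffel symbols from the relation $\Gamma^k_{ij} = \mathsf{g}^k\cdot\partial\mathsf{g}_i/\partial y^j$, using central differences for the derivative of the basis vectors. It assumes the cylindrical-type coordinate basis written out in `basis`, and it exploits the fact that this particular basis is orthogonal, so that $\mathsf{g}^k = \mathsf{g}_k/|\mathsf{g}_k|^2$; that shortcut would not be valid for a general coordinate system.

```python
import math

def basis(y):
    """Coordinate basis g_1, g_2, g_3 for an assumed cylindrical-type map
    x = (r cos(theta), r sin(theta), z), with y = (r, theta, z)."""
    r, theta, z = y
    return [
        (math.cos(theta), math.sin(theta), 0.0),
        (-r * math.sin(theta), r * math.cos(theta), 0.0),
        (0.0, 0.0, 1.0),
    ]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def christoffel(y, h=1e-6):
    """Return gamma with gamma[k][i][j] ~ Gamma^k_ij = g^k . (d g_i / d y^j)."""
    g = basis(y)
    # Orthogonal basis only: reciprocal vectors are g^k = g_k / (g_k . g_k).
    recip = [tuple(c / dot(gk, gk) for c in gk) for gk in g]
    gamma = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for j in range(3):
        yp, ym = list(y), list(y)
        yp[j] += h
        ym[j] -= h
        gp, gm = basis(yp), basis(ym)
        for i in range(3):
            # Central-difference approximation of d g_i / d y^j.
            dgi = tuple((a - b) / (2 * h) for a, b in zip(gp[i], gm[i]))
            for k in range(3):
                gamma[k][i][j] = dot(recip[k], dgi)
    return gamma

y = (2.0, 0.7, -1.0)      # sample point r = 2 (assumed)
gamma = christoffel(y)
```

At this point the only entries that are not numerically zero are `gamma[0][1][1]` (close to $-r = -2$) and `gamma[1][0][1]`, `gamma[1][1][0]` (both close to $1/r = 0.5$), which also exhibits the symmetry in the lower indices.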
Remark
In the general theory of differentiable manifolds, the appropriate generalization of the ideas discussed leads to the notion of a connection on the manifold.
It can be shown by a simple calculation that the Christoffel symbols do not transform as the components of a tensor upon change of coordinates. However, the quantity $$T^k_{ij}(\mathsf{y}) = \Gamma^k_{ij}(\mathsf{y}) - \Gamma^k_{ji}(\mathsf{y})$$ does indeed transform as a tensor. The numbers $T^k_{ij}$ can be shown to be the components of a third order tensor called the torsion tensor. For our purposes, it is sufficient to note that the torsion tensor vanishes identically since the Christoffel symbols possess the following symmetry: $\Gamma^k_{ij}(\mathsf{y}) = \Gamma^k_{ji}(\mathsf{y})$, as can be easily verified from the expression defining the Christoffel symbols (using equality of mixed partial derivatives).
Remark
In the general theory of differentiable manifolds, it is possible to construct connections whose torsion does not vanish.
Metric tensor
It is convenient at this point to introduce the metric tensor , where $U \subseteq \mathbb{R}^3$ is open, as follows: for any $x \in U$, Note that this is just the identity tensor field. In terms of another coordinate system $(U,\phi)$ on $U$ that maps $\mathsf{x}$ to $\mathsf{y}$, it is easily checked that the following identities hold: We will also use the notation $\mathsf{g}(\mathsf{y})$ to denote $\mathsf{g}(\mathsf{y}(\mathsf{x}))$.
Remark
Technically, the metric tensor has to be defined as a tensor of the form , where is the cotangent bundle on $U$. Since we are dealing throughout with finite dimensional Euclidean spaces, an implicit identification of the cotangent bundle with the tangent bundle is made in the foregoing definition. This is not a good idea in general - the simpler and more imprecise approach adopted here is sufficient for our purposes.
The idea behind the metric is to provide a smooth extension of the inner product over every tangent space. Indeed, for any , given any , it follows that
Introducing the notation for the elements of the Jacobian matrix for the change of coordinate system from to the Cartesian coordinate system, it follows from a simple calculation that Denoting by the determinant of the matrix whose entry is , and by the determinant of the matrix whose entry is , it follows from the previous equation that Note that in the equation above, . Similar expressions can be computed for the inverse transformation from the curvilinear coordinates $\mathsf{y}$ to the Cartesian coordinates $\mathsf{x}$.
Jacobi's formula
An important result that is quite useful in applications concerns the derivative of the determinant of $g(\mathsf{y})$ with respect to its entries $g_{ij}(\mathsf{y})$. Towards this end, consider an invertible matrix $\mathsf{A}$ whose (real-valued) entries are $a_{ij}$. We will denote the entries of its inverse $\mathsf{A}^{-1}$ as $a^{ij}$. Let $a$ be the determinant of $\mathsf{A}$. From elementary matrix algebra, we know that the inverse of the matrix is computed as $$\mathsf{A}^{-1} = \frac{1}{a}\,(\text{cof }\mathsf{A})^T.$$ Here, $\text{cof }\mathsf{A}$ denotes the cofactor matrix of $\mathsf{A}$, and its transpose is called the adjoint of $\mathsf{A}$. We wish to compute the derivative of $a$ with respect to $a_{ij}$. Let us consider the Laplace expansion of the determinant along the $i$-th row: $$a = \sum_k a_{ik}\,(\text{cof }\mathsf{A})_{ik} \quad \text{(no sum over } i\text{)}.$$ Differentiating this with respect to $a_{ij}$ and noting that $(\text{cof }\mathsf{A})_{ik}$ does not involve $a_{ij}$ for any $k$, a simple calculation shows that $$\frac{\partial a}{\partial a_{ij}} = (\text{cof }\mathsf{A})_{ij} = a\,a^{ji}.$$ This result is called Jacobi's formula.
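Applying this to the matrix whose entries are $g_{ij}(\mathsf{y})$, and using the symmetry $g^{ij}(\mathsf{y}) = g^{ji}(\mathsf{y})$, we see at once that $$\frac{\partial (\det g)}{\partial g_{ij}} = (\det g)\,g^{ij}.$$ This result is quite useful in applications.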
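Jacobi's formula, in the form $\partial a/\partial a_{ij} = a\,a^{ji}$, is easy to sanity-check numerically. The sketch below compares a central-difference derivative of the determinant with $a\,a^{ji}$ for every entry of a small matrix. The matrix `A` is an arbitrary, deliberately non-symmetric example (an assumption for illustration) so that the transposition of indices is actually exercised.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inverse3(m):
    """Inverse of a 3x3 matrix via the transposed cofactor matrix."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]

def ddet(m, i, j, h=1e-6):
    """Central-difference approximation of d(det m) / d m[i][j]."""
    mp = [row[:] for row in m]
    mm = [row[:] for row in m]
    mp[i][j] += h
    mm[i][j] -= h
    return (det3(mp) - det3(mm)) / (2 * h)

A = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [1.0, 0.0, 4.0]]          # hypothetical invertible matrix

a = det3(A)
A_inv = inverse3(A)

# Largest discrepancy between the numerical derivative and a * a^{ji}.
max_error = max(abs(ddet(A, i, j) - a * A_inv[j][i])
                for i in range(3) for j in range(3))
```

Because the determinant is linear in each individual entry, the central difference is exact up to rounding, so `max_error` is limited only by floating-point precision.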
Gradient
Let be a smooth vector field over an open subset . Recall from the earlier discussion that all the key differential quantities related to are obtained from the covariant derivative of . Given a coordinate system , the curvilinear coordinate representation of the covariant derivative of is obtained as follows: for any and , with , In the derivation above, , where are constants such that . We will use the following special notation to denote the components of the covariant derivative of $\mathsf{v}$: Using this, the final expression for the covariant derivative just derived can be rewritten as follows: It immediately follows from this equation that the gradient of the vector field can be written in curvilinear coordinates as In the expression above, .
Alternatively, starting with the representation , an analogous calculation yields Note the negative sign in the second term.
The foregoing calculations can be extended to tensor fields too. Rather than providing the corresponding expressions for a general tensor field, the special case of a second order tensor field is considered here. The starting point is, as before, the covariant derivative of at along . This is computed as follows: As before, we introduce the following special notation for the components of the covariant derivative of $\mathsf{A}$: The gradient of can be easily seen from the above calculation as Alternatively, starting with the representation a similar computation yields Similar representations can be obtained when representing $\mathsf{A}(\mathsf{x})$ using mixed components.
Gradient of the metric tensor
As an immediate application, let us compute the covariant derivative of the metric tensor $\mathsf{g}:U \to \otimes^2 TU$. It follows from the definition of the covariant derivative that since does not vary as $\mathsf{x}$ varies over $U$, that This is equivalent to stating that $g_{ij|k}(\mathsf{y}) = 0$, for any choice of coordinate system. Working out the details, we get The foregoing equation can alternatively be obtained by directly differentiating with respect to $\mathsf{y}$ and setting it to zero. A straightforward algebraic manipulation of this result yields the following useful relation: Alternatively, the validity of this expression can be checked by substituting the expression for the partial derivatives of $g_{ij}(\mathsf{y})$ with respect to $\mathsf{y}$ computed earlier.
Parallel transport
Given a curve $\mathsf{c}:I \subseteq \mathbb{R} \to U \subseteq \mathbb{R}^3$ and a vector field $\mathsf{v}:U \to TU$, we say that the vector field $\mathsf{v}$ is parallelly transported along the curve $\mathsf{c}$ if, for every $t \in I$, Intuitively, the idea is that the vector field is just shifted along the curve in a manner such that it is parallel to itself. In terms of a curvilinear coordinate system $(U,\phi)$ we can write the component version of the foregoing equation as Here $\dot{\hat{c}}^j(t)$ refers to the components of with respect to the coordinate basis . If it is true that the vector field $\mathsf{v}$ is parallelly transported by any arbitrary curve in $U$, then we can reduce the foregoing equation to the simpler condition that for any $\mathsf{x} \in U$, These ideas can be generalized to the case of a tensor field on $U$ along the same lines. Note in particular that the conclusions of the previous discussion can be equivalently rephrased as follows: the metric tensor $\mathsf{g}$ is parallelly transported by any curve in $U$.
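A small numerical experiment can make the component form of the parallel transport condition, $\dot{v}^i + \sum_{j,k}\Gamma^i_{jk}\,v^j\,\dot{\hat{c}}^k = 0$, more tangible. The sketch below works in plane polar coordinates $(r,\theta)$, a two-dimensional restriction chosen only to keep the code short, and takes the constant Cartesian field $\mathsf{e}_1$, whose polar contravariant components are $(\cos\theta, -\sin\theta/r)$. The curve and the sample parameter values are hypothetical choices; the nonzero polar Christoffel symbols used are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$.

```python
import math

def curve(t):
    """Hypothetical curve, given by its polar coordinates (r(t), theta(t))."""
    return (1.0 + t, t)

def components(t):
    """Contravariant polar components along the curve of the constant field e_1."""
    r, theta = curve(t)
    return (math.cos(theta), -math.sin(theta) / r)

def gamma(r):
    """Christoffel symbols of plane polar coordinates, G[i][j][k] = Gamma^i_jk."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -r
    G[1][0][1] = 1.0 / r
    G[1][1][0] = 1.0 / r
    return G

def residual(t, h=1e-6):
    """Left-hand side dv^i/dt + Gamma^i_jk v^j (dc^k/dt), by central differences."""
    r, _ = curve(t)
    cp, cm = curve(t + h), curve(t - h)
    cdot = [(a - b) / (2 * h) for a, b in zip(cp, cm)]
    vp, vm = components(t + h), components(t - h)
    vdot = [(a - b) / (2 * h) for a, b in zip(vp, vm)]
    v = components(t)
    G = gamma(r)
    return [vdot[i] + sum(G[i][j][k] * v[j] * cdot[k]
                          for j in range(2) for k in range(2))
            for i in range(2)]

residuals = [residual(t) for t in (0.0, 0.5, 1.0)]
```

The residuals vanish up to discretization error even though the individual components $v^i$ change along the curve, which is the sense in which a field that is constant in Cartesian terms is parallelly transported.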
Riemann curvature tensor
We introduced the notion of a covariant derivative as a generalization of the notion of a directional derivative. The question we want to ask is whether covariant derivatives commute in the sense that mixed partial derivatives commute, where the order in which we compute a mixed partial derivative doesn't matter. To answer this question, suppose that we are given a smooth vector field and two vectors $\mathsf{w}, \mathsf{z} \in T_{\mathsf{x}}U$ for some $\mathsf{x} \in U$. We want to compute the following quantity in order to understand the commutativity of the covariant derivative. Working in a curvilinear coordinate system $(U,\phi)$, a simple calculation shows that It is thus sufficient to study the effect of interchanging the order of second derivatives for the components of the covariant derivative. A straightforward calculation shows that We therefore see that The tensor $\mathsf{R}$ is called the Riemann curvature tensor. It is left as a straightforward (though cumbersome!) exercise to verify that the components $R^i_{ljk}$ indeed transform appropriately upon change of coordinates.
The tensorial nature of the Riemann curvature tensor implies that it is enough to evaluate it in any one particular coordinate system. Choosing the standard Cartesian coordinate system on $\mathbb{R}^3$, in which all the Christoffel symbols vanish, we see at once that $\mathsf{R} = \mathsf{0}$. The vanishing of the Riemann curvature tensor is equivalently stated as the flatness of the Euclidean space $\mathbb{R}^3$.
Divergence
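The divergence of the smooth vector field $\mathsf{v}$ is obtained by contracting the gradient: $$\text{div }\mathsf{v} = \sum_i v^i_{|i} = \sum_i \frac{\partial v^i}{\partial y^i} + \sum_{i,j} \Gamma^i_{ij}\,v^j.$$ Notice how the expression for divergence derived here reduces to the standard Cartesian representation when the Christoffel symbols are set to zero.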
Similar calculations can be carried out for tensor fields. The divergence of the second order tensor field considered in the previous section is computed easily from this expression: The curvilinear coordinate representation of the divergence of a tensor field of arbitrary order over can be computed using a straightforward extension of the ideas presented above.
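A particularly convenient form of the divergence of a vector field is obtained by noting, from the relationship between the Christoffel symbols and the metric tensor derived earlier, that $$\sum_i \Gamma^i_{ij} = \frac{1}{2}\sum_{i,k} g^{ik}\,\frac{\partial g_{ik}}{\partial y^j}.$$ Using Jacobi's formula for the derivatives of the determinant, we see that $$\frac{\partial (\det g)}{\partial y^j} = \sum_{i,k} \frac{\partial (\det g)}{\partial g_{ik}}\,\frac{\partial g_{ik}}{\partial y^j} = (\det g)\sum_{i,k} g^{ik}\,\frac{\partial g_{ik}}{\partial y^j}.$$ We therefore see that $$\sum_i \Gamma^i_{ij} = \frac{1}{2\det g}\,\frac{\partial (\det g)}{\partial y^j} = \frac{1}{\sqrt{\det g}}\,\frac{\partial \sqrt{\det g}}{\partial y^j}.$$ Using this relation, we can express the divergence of a vector field compactly as follows: $$\text{div }\mathsf{v} = \frac{1}{\sqrt{\det g}}\sum_i \frac{\partial}{\partial y^i}\left(\sqrt{\det g}\;v^i\right),$$ as can be verified by a routine calculation.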
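The compact form of the divergence, $\text{div }\mathsf{v} = \frac{1}{\sqrt{\det g}}\sum_i \partial_i(\sqrt{\det g}\,v^i)$, can be checked against a Cartesian computation. The sketch below does this for an assumed cylindrical-type map with $\sqrt{\det g} = r$ and for a hypothetical field with Cartesian components $(x_1^2,\, x_1 x_2,\, x_3)$, whose Cartesian divergence is $3x_1 + 1$. The contravariant components are obtained from $v^i = \sum_j (\partial y^i/\partial x^j)\,v_j$.

```python
import math

def x_of_y(y):
    """Assumed cylindrical-type map, y = (r, theta, z)."""
    r, theta, z = y
    return (r * math.cos(theta), r * math.sin(theta), z)

def cartesian_field(x):
    """Hypothetical field with Cartesian components (x1^2, x1*x2, x3)."""
    x1, x2, x3 = x
    return (x1 * x1, x1 * x2, x3)

def sqrt_det_g(y):
    """Square root of the metric determinant for the map above."""
    return y[0]

def contravariant(y):
    """Components v^i = sum_j (dy^i/dx^j) v_j for the map above."""
    r, theta, _ = y
    v1, v2, v3 = cartesian_field(x_of_y(y))
    c, s = math.cos(theta), math.sin(theta)
    return (c * v1 + s * v2, (-s * v1 + c * v2) / r, v3)

def divergence(y, h=1e-5):
    """(1/sqrt(det g)) * sum_i d/dy^i (sqrt(det g) v^i), by central differences."""
    total = 0.0
    for i in range(3):
        yp, ym = list(y), list(y)
        yp[i] += h
        ym[i] -= h
        fp = sqrt_det_g(yp) * contravariant(yp)[i]
        fm = sqrt_det_g(ym) * contravariant(ym)[i]
        total += (fp - fm) / (2 * h)
    return total / sqrt_det_g(y)

y = (2.0, 0.7, -1.0)                          # sample point (assumed)
div_curvilinear = divergence(y)
div_cartesian = 3.0 * x_of_y(y)[0] + 1.0      # exact Cartesian divergence
```

No Christoffel symbols appear explicitly in this form; their contribution is carried entirely by the factor $\sqrt{\det g}$ inside the derivative.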
Curl
The curl of the vector field can be computed from its divergence using the definition , where and is a constant vector field. Employing a curvilinear coordinate system , the vector can be written as where . To evaluate , we use the definition of the Levi-Civita tensor on , and the definition of the determinant, to first establish the following result: This shows at once that Returning to the computation of the vector in curvilinear coordinates, we note that The divergence of can now be computed in curvilinear coordinates, using the expression derived in the previous section, as The curvilinear coordinate expression for the curl of the vector field follows at once from this calculation: Alternatively, a simpler expression is obtained by directly using the definition of divergence in terms of the covariant derivative. Skipping the lengthy algebraic details, the final expression for the curl of a vector field takes the following compact form: The curl of tensor fields of higher order is computed along the same lines. The final expressions are not provided here since they have cumbersome algebraic forms in curvilinear coordinates.
Integration
We will now briefly discuss volume integration in curvilinear coordinates. Since the key ideas behind line and surface integrals were covered in a coordinate independent fashion in the previous chapter, we will not discuss them here - it is straightforward to extend the results presented here to cover them.
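Suppose that $f:U \to \mathbb{R}$ is a scalar field on an open subset $U \subseteq \mathbb{R}^3$. Given a coordinate system $(U,\phi)$ on $U$, the integral of $f$ over $U$ is defined as follows: $$\int_U f(\mathsf{x})\,dx^1\,dx^2\,dx^3 = \int_{\phi(U)} f(\mathsf{x}(\mathsf{y}))\,\sqrt{\det g(\mathsf{y})}\;dy^1\,dy^2\,dy^3.$$ In the equation displayed above $g(\mathsf{y})$ denotes the matrix whose $(i,j)$ component is $g_{ij}(\mathsf{y})$. This result follows from the change of variables formula for multiple integrals.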
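To illustrate the role of the factor $\sqrt{\det g}$, the sketch below evaluates two volume integrals over a solid cylinder of radius $R$ and height $H$ with a simple midpoint rule on a box in the curvilinear coordinates. The cylindrical-type map, the value $\sqrt{\det g} = r$, the dimensions `R` and `H`, and the grid sizes are all assumptions made for the example. The exact values for comparison are $\pi R^2 H$ for the volume and $\pi R^4 H/2$ for the integral of $x_1^2 + x_2^2$.

```python
import math

def x_of_y(y):
    """Assumed cylindrical-type map, y = (r, theta, z)."""
    r, theta, z = y
    return (r * math.cos(theta), r * math.sin(theta), z)

def sqrt_det_g(y):
    """Square root of the metric determinant for the map above."""
    return y[0]

def integrate(f, bounds, n):
    """Midpoint rule for the integral of f(x(y)) * sqrt(det g(y)) over a box in y-space."""
    (a1, b1), (a2, b2), (a3, b3) = bounds
    n1, n2, n3 = n
    h1, h2, h3 = (b1 - a1) / n1, (b2 - a2) / n2, (b3 - a3) / n3
    total = 0.0
    for i in range(n1):
        y1 = a1 + (i + 0.5) * h1
        for j in range(n2):
            y2 = a2 + (j + 0.5) * h2
            for k in range(n3):
                y3 = a3 + (k + 0.5) * h3
                y = (y1, y2, y3)
                total += f(x_of_y(y)) * sqrt_det_g(y)
    return total * h1 * h2 * h3

R, H = 2.0, 3.0                                        # hypothetical cylinder
box = ((0.0, R), (0.0, 2.0 * math.pi), (0.0, H))
grid = (200, 16, 4)

volume = integrate(lambda x: 1.0, box, grid)
moment = integrate(lambda x: x[0] ** 2 + x[1] ** 2, box, grid)
```

Omitting `sqrt_det_g` would amount to integrating over the coordinate box itself, giving $2\pi R H$ instead of the volume of the cylinder.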
Special coordinate systems
We will now present results specializing the foregoing discussion to two special coordinate systems that are frequently encountered in applications. The details of the various calculations are skipped since they are consequences of a routine application of the theory developed so far to the special cases considered here.
Cylindrical polar coordinates
For problems that have a special axis of symmetry, it is convenient to employ cylindrical polar coordinates. Suppose, for concreteness, that the axis of symmetry is the $x^3$-axis. The cylindrical polar coordinates are defined in terms of Cartesian coordinates as $$y^1 = \sqrt{(x^1)^2 + (x^2)^2}, \qquad y^2 = \tan^{-1}\left(\frac{x^2}{x^1}\right), \qquad y^3 = x^3.$$ In practice, the notations $y^1 = r$, $y^2 = \theta$ and $y^3 = z$ are used for cylindrical polar coordinates. It is conventional to choose the coordinates such that $r \in (0,\infty)$, $\theta \in (0,2\pi)$, and $z \in \mathbb{R}$. The inverse of this transformation takes the form $$x^1 = r\cos\theta, \qquad x^2 = r\sin\theta, \qquad x^3 = z.$$ The domain of definition of the cylindrical coordinate system is not stated explicitly here, but can be inferred from the validity of the corresponding expressions.
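The coordinate tangent vectors in the context of cylindrical polar coordinates take the following form: $$\mathsf{g}_1 = \cos\theta\,\mathsf{e}_1 + \sin\theta\,\mathsf{e}_2 = \mathsf{e}_r, \qquad \mathsf{g}_2 = -r\sin\theta\,\mathsf{e}_1 + r\cos\theta\,\mathsf{e}_2 = r\,\mathsf{e}_\theta, \qquad \mathsf{g}_3 = \mathsf{e}_3 = \mathsf{e}_z.$$ In the equations displayed above, the more convenient orthonormal basis $(\mathsf{e}_r, \mathsf{e}_\theta, \mathsf{e}_z)$ is also introduced. The reciprocal basis vectors are easily computed as $\mathsf{g}^1 = \mathsf{e}_r$, $\mathsf{g}^2 = \mathsf{e}_\theta/r$, and $\mathsf{g}^3 = \mathsf{e}_z$.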
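The components of the metric tensor are given by $$g_{11} = 1, \qquad g_{22} = r^2, \qquad g_{33} = 1, \qquad g_{ij} = 0 \text{ for } i \neq j.$$ The square root of the determinant of this matrix, which is needed for volume integration, is easily computed as $\sqrt{\det g} = r$. The Christoffel symbols for cylindrical polar coordinates are given by $$\Gamma^1_{22} = -r, \qquad \Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{r},$$ with all other components equal to zero.
Suppose now that $f:\mathbb{R}^3 \to \mathbb{R}$ is a given scalar field. The gradient of $f$, expressed in cylindrical polar coordinates, takes the following form: $$\nabla f = \frac{\partial f}{\partial r}\,\mathsf{e}_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\mathsf{e}_\theta + \frac{\partial f}{\partial z}\,\mathsf{e}_z.$$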
Remark
In this and the next section, the arguments for the various functions will be suppressed to keep the equations readable. For instance, the foregoing equation is more properly expressed as It is left to the reader to fill in these arguments in the various equations listed in this section.
Given a smooth vector field $\mathsf{v}:\mathbb{R}^3 \to T\mathbb{R}^3$, we denote it in cylindrical polar coordinates as Note that contravariant components of $\mathsf{v}$, with respect to the basis , are related as $v^1 = v_r$, $v^2 = v_\theta/r$, and $v^3 = v_z$. The gradient of $\mathsf{v}$ takes the following form It follows at once from this expression that the divergence of $\mathsf{v}$ takes the following form: The curl of the vector field $\mathsf{v}$ is given by The corresponding expressions for tensor fields are not provided here, but can be computed analogously.
Spherical coordinates
For applications with radial symmetry, it is convenient to employ spherical coordinates. The spherical coordinates are defined in terms of Cartesian coordinates as $$y^1 = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}, \qquad y^2 = \cos^{-1}\left(\frac{x^3}{\sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}}\right), \qquad y^3 = \tan^{-1}\left(\frac{x^2}{x^1}\right).$$ In practice, the notations $y^1 = r$, $y^2 = \theta$, and $y^3 = \phi$ are used for spherical coordinates. The coordinates are chosen such that $r \in (0,\infty)$, $\theta \in (0,\pi)$, and $\phi \in (0,2\pi)$.
Note
Some authors use $\theta$ for $y^3$ and $\phi$ for $y^2$. There is also some variability as to the range of these coordinates.
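The inverse of this transformation takes the following form: $$x^1 = r\sin\theta\cos\phi, \qquad x^2 = r\sin\theta\sin\phi, \qquad x^3 = r\cos\theta.$$ As in the case of cylindrical polar coordinates, the domains of definition of these transformations are chosen in a manner such that the various expressions are well defined.
The coordinate tangent vectors for spherical coordinates are easily computed: $$\mathsf{g}_1 = \sin\theta\cos\phi\,\mathsf{e}_1 + \sin\theta\sin\phi\,\mathsf{e}_2 + \cos\theta\,\mathsf{e}_3 = \mathsf{e}_r,$$ $$\mathsf{g}_2 = r\left(\cos\theta\cos\phi\,\mathsf{e}_1 + \cos\theta\sin\phi\,\mathsf{e}_2 - \sin\theta\,\mathsf{e}_3\right) = r\,\mathsf{e}_\theta,$$ $$\mathsf{g}_3 = r\sin\theta\left(-\sin\phi\,\mathsf{e}_1 + \cos\phi\,\mathsf{e}_2\right) = r\sin\theta\,\mathsf{e}_\phi.$$ The alternative, and more convenient, orthonormal basis $(\mathsf{e}_r, \mathsf{e}_\theta, \mathsf{e}_\phi)$ is also introduced in the equations displayed above. The reciprocal basis is easily computed as $\mathsf{g}^1 = \mathsf{e}_r$, $\mathsf{g}^2 = \mathsf{e}_\theta/r$, and $\mathsf{g}^3 = \mathsf{e}_\phi/(r\sin\theta)$.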
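The components of the metric tensor are given by $$g_{11} = 1, \qquad g_{22} = r^2, \qquad g_{33} = r^2\sin^2\theta, \qquad g_{ij} = 0 \text{ for } i \neq j.$$ The square root of the determinant of this matrix, which is needed for volume integration, is easily computed as $\sqrt{\det g} = r^2\sin\theta$. The Christoffel symbols for spherical coordinates are given by $$\Gamma^1_{22} = -r, \qquad \Gamma^1_{33} = -r\sin^2\theta, \qquad \Gamma^2_{12} = \Gamma^2_{21} = \frac{1}{r}, \qquad \Gamma^2_{33} = -\sin\theta\cos\theta, \qquad \Gamma^3_{13} = \Gamma^3_{31} = \frac{1}{r}, \qquad \Gamma^3_{23} = \Gamma^3_{32} = \cot\theta,$$ with all other components equal to zero.
If $f:\mathbb{R}^3 \to \mathbb{R}$ is a smooth scalar field, its gradient takes the following form in spherical coordinates: $$\nabla f = \frac{\partial f}{\partial r}\,\mathsf{e}_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\mathsf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi}\,\mathsf{e}_\phi.$$ As before, we suppress the arguments of the functions to enhance readability.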
Given a smooth vector field , we express it in spherical coordinates as The contravariant components of , with respect to the basis , are related as $v^1 = v_r$, $v^2 = v_\theta/r$, and $v^3 = v_\phi/(r\sin \theta)$. The gradient of is computed as The trace of the gradient immediately yields the divergence of as The expression for the curl of in spherical coordinates is given by the following expression: Similar expressions for tensors can be derived along the same lines.