How not to spill your friend's coffee: tensor invariance under coordinate system transformations
Although the word tensor is widely used to represent a multi-dimensional array, in physics and continuum mechanics a tensor has a very specific meaning. These are some definitions I have come across:
- "A tensor of rank \(n\) is an array of \(3^n\) values (in 3-D space) called "tensor components" that combine with multiple directional indicators (basis vectors) to form a quantity that does not vary as the coordinate system is changed." - A Student's Guide to Vectors and Tensors (Fleisch, 2011)
- "In mathematics, a tensor is an algebraic object that describes a (multilinear) relationship between sets of algebraic objects related to a vector space." - Wikipedia
- "A tensor of order two (second-order tensor) is a linear map that maps every vector into a vector..." - Wikipedia
When I first began studying continuum mechanics, tensors were a conundrum to me. I understood how to manipulate them algebraically but obtaining an intuitive understanding of them was a more difficult path. The purpose of this post is to simplify tensors for students who may have also run into the same roadblocks as I did. This post assumes familiarity with orthonormal bases, coordinate systems (frames), and measuring the position and orientation of one frame with respect to another.
The first simplification towards understanding tensors is to use only orthogonal coordinate systems whose basis vectors have unit length (orthonormal bases). This removes the need to distinguish covariant from contravariant components. Studying non-orthogonal coordinate systems is mathematically enlightening, but it is not a prerequisite to understanding tensors. Covariant and contravariant components are an accounting mechanism for non-orthogonal coordinate systems. Simple operations, such as computing the dot product of two vectors, are more computationally involved when non-orthogonal coordinate systems are used. I emphasize computationally because the mathematical concept is still the same; only the mathematical mechanics differ. So from this point forward, we will use only orthogonal coordinate systems. By the way, A Student's Guide to Vectors and Tensors is a great introduction to non-orthogonal coordinate systems and tensors.
As the Wikipedia quote above mentions, a tensor is a mathematical object that expresses a multilinear relationship. Using tensor algebra, a multilinear relationship can be expressed as a linear relationship in a higher-dimensional space. Therefore, we can say that a tensor is a mathematical object that expresses a linear relationship. Although a (potentially multi-dimensional) matrix also expresses a linear relationship, tensors and matrices are not equivalent concepts. A matrix of components can represent a tensor, but infinitely many matrices represent the same tensor, because the components depend on the set of basis vectors chosen for computation. As is customary, I will from now on refer to the components of the matrix that represents a tensor simply as the components of the tensor. A key principle to keep in mind is that the components of a tensor, by themselves, do not mean anything. They have to be tied to basis vectors. No "universal" set of basis vectors exists; all sets of basis vectors are equally valid. To make the concept of tensors concrete, I will use an admittedly silly, but I hope useful, example.
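Before turning to the example, it may help to see the "components tied to basis vectors" idea in symbols. The notation here is my own shorthand rather than anything used later in the post: \(\mathbf{T}\) is a second-order tensor, \(\{\vec{e_i}\}\) and \(\{\vec{e'_i}\}\) are two orthonormal bases, and \(T_{ij}\) and \(T'_{ij}\) are the components in each.

\[ \mathbf{T} = \sum_{i=1}^{3}\sum_{j=1}^{3} T_{ij} \, \vec{e_i} \otimes \vec{e_j} = \sum_{i=1}^{3}\sum_{j=1}^{3} T'_{ij} \, \vec{e'_i} \otimes \vec{e'_j} \]

The two arrays of numbers generally differ, but paired with their own basis vectors they describe the same object.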
Imagine that you have found a foolproof method of drinking coffee that ensures that not one drop will ever be spilled. You want to share this method with the world. After much consideration, you decide that only the precision of a mathematical treatise will do it justice. The key to this method is to first rotate the coffee cup to the right side of your body, then bend your elbow to bring the coffee cup to your mouth (I apologize to left-handed folks out there for picking sides, but I did have to pick a side and I am right-handed). Let's now explore how we could describe this procedure mathematically. First, let's establish an orthogonal coordinate system centered at your shoulder joint. A coordinate system is the combination of an origin (here, the shoulder joint center) and an orthonormal basis, so we define the following vectors:
- \(\vec{b_1}\) to point forward
- \(\vec{b_2}\) to point up
- \(\vec{b_3}\) to point to the right
Given these vectors, we form an orthonormal basis: \(B=\{\vec{b_1}, \vec{b_2}, \vec{b_3}\}\). Note that this definition is completely arbitrary, but it now lets us make measurements. The coffee drinking procedure begins by extending your right arm forward, picking up the coffee cup from your desk, and holding it upright (obviously), directly in front of you, with the handle pointing to your right (laterally). This is state 1. We also create and attach an orthonormal basis to the coffee cup, say at its center of gravity, and define \(C=\{\vec{c_1}, \vec{c_2}, \vec{c_3}\}\):
- \(\vec{c_1}\) points toward the front of the mug (forward)
- \(\vec{c_2}\) points toward the top of the mug (up)
- \(\vec{c_3}\) points toward the handle (right)
The orientation of the coffee cup, with respect to the shoulder coordinate system, in state 1 is:
\[ ^BR_{C_1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Next, you rotate your arm outward so you are holding the coffee cup upright, directly to your right, with the handle pointing behind you (posteriorly). This is state 2 and the orientation of the coffee cup is:
\[ ^BR_{C_2} = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \]
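Both orientation matrices can be read as "the cup's axes, written as columns, in the shoulder basis." The sketch below rebuilds them that way in plain Python. The names `FORWARD`, `UP`, `RIGHT`, `negate`, and `orientation` are mine, chosen only for illustration.

```python
# Directions expressed in your shoulder basis B = {forward, up, right}.
FORWARD = (1, 0, 0)  # b1
UP = (0, 1, 0)       # b2
RIGHT = (0, 0, 1)    # b3


def negate(direction):
    """Return the opposite direction (for example, forward -> backward)."""
    return tuple(-x for x in direction)


def orientation(axis_1, axis_2, axis_3):
    """Stack three body-fixed axes as the columns of a 3x3 matrix (list of rows)."""
    return [list(row) for row in zip(axis_1, axis_2, axis_3)]


# State 1: front of the cup forward, top up, handle to the right.
R_C1 = orientation(FORWARD, UP, RIGHT)

# State 2: front of the cup to the right, top up, handle backward.
R_C2 = orientation(RIGHT, UP, negate(FORWARD))

print(R_C1)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(R_C2)  # [[0, 0, -1], [0, 1, 0], [1, 0, 0]]
```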
Let's now compute the rotation matrix that transforms the coffee mug from its orientation in state 1, \(^BR_{C_1}\), to its orientation in state 2, \(^BR_{C_2}\). We will call it \(R^Y\), meaning the coffee cup rotation tensor as measured by you.
\[ \begin{equation*} \begin{split} {R^Y} \cdot {^BR_{C_1}} &= {^BR_{C_2}} \\ {R^Y} &= {^BR_{C_2}} \cdot {(^BR_{C_1})^T} \\ {R^Y} &= \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ {R^Y} &= \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \end{split} \end{equation*} \]
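The same computation can be checked numerically. This is a minimal sketch using plain Python lists so that it has no dependencies (a library such as NumPy would be shorter). The helper names `transpose` and `matmul` are my own.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Cup orientations measured in your shoulder basis B (taken from the text).
R_C1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
R_C2 = [[0, 0, -1], [0, 1, 0], [1, 0, 0]]

# R_Y . R_C1 = R_C2  =>  R_Y = R_C2 . R_C1^T, because the inverse of a
# rotation matrix is its transpose.
R_Y = matmul(R_C2, transpose(R_C1))
print(R_Y)  # [[0, 0, -1], [0, 1, 0], [1, 0, 0]]

# Sanity checks: R_Y takes state 1 to state 2, and R_Y is orthogonal.
print(matmul(R_Y, R_C1) == R_C2)               # True
print(matmul(R_Y, transpose(R_Y)) == IDENTITY)  # True
```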
Great! Now you have a matrix that allows you to rotate the coffee cup from in front of you to your right side. You're very close to sharing how to properly drink coffee with the world. To help test this method, you decide to enlist the help of your friend. You tell your friend to establish a coordinate system centered at the shoulder joint, one centered at the center of gravity of the mug, and to then apply the matrix \({R^Y}\) defined above. You indicate that it is important that in state 1 the coffee mug is held upright, directly in front of her - so as not to spill the coffee - and that the mug handle faces to her right. Your friend goes through the same procedure as you, but she prefers to represent the up direction by the third unit vector. So, she defines the shoulder joint basis as \(S=\{\vec{s_1}, \vec{s_2}, \vec{s_3}\}\):
- \(\vec{s_1}\) to point to the right
- \(\vec{s_2}\) to point forward
- \(\vec{s_3}\) to point up
She does as you say and attaches a frame to the mug, \(M=\{\vec{m_1}, \vec{m_2}, \vec{m_3}\}\). Although you did not ask her to, she also chooses the mug axes so that the orientation of the mug in state 1 is the identity matrix, because that makes computations easier:
- \(\vec{m_1}\) points toward the handle (right)
- \(\vec{m_2}\) points toward the front of the mug (forward)
- \(\vec{m_3}\) points toward the top of the mug (up)
\[ ^SR_{M_1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Now, to test your matrix!
\[ \begin{equation*} \begin{split} {^SR_{M_2}} &= {R^Y} \cdot {^SR_{M_1}} \\ {^SR_{M_2}} &= \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ {^SR_{M_2}} &=\begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \end{split} \end{equation*} \]
Your friend is very upset! She ended up with spilled coffee all over her floor. What happened? \(^SR_{M_2}\) indicates that the coffee mug is tipped onto its side: the handle points up, and the top (mouth) points directly to the left. This example hopefully provides a basis for understanding tensors. Please remember that all preceding and following statements apply only to orthogonal bases.
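The spill can be read directly off the columns of \(^SR_{M_2}\), since each column is one mug axis expressed in her shoulder basis. A small sketch, where the lookup table `S_DIRECTIONS` and the helper `column` are hypothetical names of mine:

```python
# Unit directions named in your friend's shoulder basis S = {right, forward, up}.
S_DIRECTIONS = {
    (1, 0, 0): "right", (-1, 0, 0): "left",
    (0, 1, 0): "forward", (0, -1, 0): "backward",
    (0, 0, 1): "up", (0, 0, -1): "down",
}


def column(matrix, j):
    """Return column j of a matrix stored as a list of rows."""
    return tuple(row[j] for row in matrix)


R_Y = [[0, 0, -1], [0, 1, 0], [1, 0, 0]]

# Her state 1 orientation is the identity, so applying R_Y gives R_Y itself.
S_R_M2 = R_Y

# Mug axes in her order: m1 = handle, m2 = front, m3 = top.
mug_axes = ("handle", "front", "top")
pointing = {name: S_DIRECTIONS[column(S_R_M2, j)] for j, name in enumerate(mug_axes)}
print(pointing)  # {'handle': 'up', 'front': 'forward', 'top': 'left'}
```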
This coffee tragedy can be explained by the fact that the matrix \({R^Y}\) represents a 90° clockwise rotation around the 2nd basis vector (clockwise when viewed from the tip of that vector, looking back toward the origin). In your shoulder coordinate system the 2nd basis vector points up, leading to the desired coffee mug rotation. In your friend's coordinate system the 2nd basis vector points forward, leading to spilled coffee. Many solutions exist to this existential crisis:
- Specify how to establish the shoulder and cup/mug coordinate systems upfront.
- Do not specify the cup/mug coordinate system at all; instead, state that the mug must rotate 90° clockwise (viewed from above) about the gravitational (vertical) axis.
- Use tensor transformation laws.
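To illustrate the second option, the sketch below builds a rotation matrix from an axis and an angle using Rodrigues' rotation formula, a standard result that this post does not otherwise use. The function name `rotation_about` is mine, and the sign convention assumed is the right-hand rule, so a clockwise rotation viewed from the tip of the axis is a negative angle.

```python
import math


def rotation_about(axis, angle_deg):
    """Rotation matrix about a unit axis via Rodrigues' formula.

    R = cos(t) I + (1 - cos(t)) n n^T + sin(t) K, where K is the
    cross-product matrix of the unit axis n. Entries are rounded to
    12 decimals to suppress floating-point noise.
    """
    x, y, z = axis
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    k = [[0, -z, y], [z, 0, -x], [-y, x, 0]]
    return [
        [round(c * (i == j) + (1 - c) * axis[i] * axis[j] + s * k[i][j], 12) + 0.0
         for j in range(3)]
        for i in range(3)
    ]


# "Up" is the 2nd basis vector for you and the 3rd basis vector for your friend.
UP_IN_B = (0, 1, 0)
UP_IN_S = (0, 0, 1)

# Same physical instruction, two different component matrices.
R_in_B = rotation_about(UP_IN_B, -90)
R_in_S = rotation_about(UP_IN_S, -90)

print(R_in_B == [[0, 0, -1], [0, 1, 0], [1, 0, 0]])  # True: this is your matrix
print(R_in_S == R_in_B)                              # False: her components differ
```

The instruction "rotate 90° clockwise about the vertical" is the same for both of you, yet the resulting arrays of numbers are not, which is the point of the options listed above.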
If this were the only engineering problem that I needed to solve, I would use the 2nd option. However, in more complex problems, the 3rd option is the only practical solution. Keeping that in mind, let's explore tensors and their transformation laws. What we seek is a tensor that performs the proper coffee mug rotation procedure in any coordinate system. So far, what we have is a matrix that performs the proper rotation procedure only in your shoulder coordinate system (more generally in a coordinate system where the 2nd basis vector points up). What if we could relate your shoulder coordinate system to your friend's shoulder coordinate system? Maybe that will help.
\[ \begin{equation} ^SR_B = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \end{equation} \]
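The matrix \(^SR_B\) follows from writing each of your basis vectors in your friend's basis and using them as columns. A minimal sketch, with hypothetical helper names `from_columns` and `apply`:

```python
# Directions expressed in your friend's shoulder basis S = {right, forward, up}.
RIGHT = (1, 0, 0)    # s1
FORWARD = (0, 1, 0)  # s2
UP = (0, 0, 1)       # s3


def from_columns(col_1, col_2, col_3):
    """Build a 3x3 matrix (list of rows) from three column vectors."""
    return [list(row) for row in zip(col_1, col_2, col_3)]


def apply(matrix, vector):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


# Your basis B is b1 = forward, b2 = up, b3 = right; its columns are written in S.
S_R_B = from_columns(FORWARD, UP, RIGHT)
print(S_R_B)  # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]

# "Up" has components (0, 1, 0) in B; converting gives its components in S.
print(apply(S_R_B, (0, 1, 0)))  # (0, 0, 1)
```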
Here's an idea: to find the matrix representation of the coffee mug rotation tensor in your friend's coordinate system, we can perhaps do the following:
- Transform from her coordinate system to yours
- Apply \({R^Y}\) since we know that it works in your coordinate system
- Transform back to her coordinate system
\[ \begin{equation*} \begin{split} {^SR_{M_2}} &= {^SR_B} \cdot {R^Y} \cdot {^BR_S} \cdot {^SR_{M_1}} \\ {^SR_{M_2}} &= {^SR_B} \cdot {R^Y} \cdot {(^SR_B)^T} \cdot {^SR_{M_1}} \\ {^SR_{M_2}} &= \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ {^SR_{M_2}} &=\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \end{split} \end{equation*} \]
Your friend can now rotate her coffee mug without spilling it, and the coffee mug ends up in the exact same orientation as yours! We can now define the same coffee mug rotation tensor in her coordinate system (\(R^F\) stands for the coffee mug rotation tensor as measured by your friend):
\[ \begin{equation} \begin{split} {R^F} &= {^SR_B} \cdot {R^Y} \cdot {(^SR_B)^T}\\ &= \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \\ &=\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \end{split} \end{equation} \]
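The transformation can be verified numerically, along with the claim that both matrices describe the same physical action. This is a sketch in plain Python; the helpers `transpose`, `matmul`, and `apply` are my own names.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply(matrix, vector):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


S_R_B = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]  # your basis expressed in hers
R_Y = [[0, 0, -1], [0, 1, 0], [1, 0, 0]]   # rotation components in your basis

# Tensor transformation law for orthonormal bases: R_F = S_R_B . R_Y . S_R_B^T
R_F = matmul(matmul(S_R_B, R_Y), transpose(S_R_B))
print(R_F)  # [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]

# Rotate the physical "forward" direction in each frame and compare.
forward_in_B = (1, 0, 0)
forward_in_S = apply(S_R_B, forward_in_B)  # (0, 1, 0)
rotated_in_B = apply(R_Y, forward_in_B)    # (0, 0, 1): "right" in your basis
rotated_in_S = apply(R_F, forward_in_S)    # (1, 0, 0): "right" in her basis

# Rotating then converting equals converting then rotating.
print(rotated_in_S == apply(S_R_B, rotated_in_B))  # True
```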
\(R^Y\) and \(R^F\) are the exact same tensor expressed in different coordinate systems. The rule above for re-expressing a tensor in a different coordinate system may seem strange at first, but hopefully it now makes sense. The same principles apply to higher-order tensors and to tensors defined in non-orthogonal coordinate systems; only the mathematical mechanics become more complex. For example, if your friend decided to establish a non-orthogonal coordinate system, the transpose used above would have to be replaced by the matrix inverse. Regardless, I hope this post helps you understand why a tensor expresses a linear (or multilinear) operation - such as a rotation - in a way that is independent of the coordinate system.
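For readers curious about the non-orthogonal case, here is a hedged sketch of the more general rule for the matrix of a linear map. The symbol \(A\) is introduced here for illustration; it stands for any invertible change-of-basis matrix and plays the role that \(^SR_B\) played above.

\[ R^F = A \cdot R^Y \cdot A^{-1} \]

When both bases are orthonormal, \(A\) is an orthogonal matrix, so \(A^{-1} = A^T\) and this reduces to the transpose form used in this post.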