March 5, 2021
Beyond Calculus
Tensors Are The Facts Of The Universe — Lillian Lieber
As the popular saying goes, necessity is the mother of invention — and when it comes to math, nature never ceases to provide us with a source for this necessity. Any experienced engineer can confirm: a very real necessity for engineers of all disciplines is to precisely model physical objects & quantities, regardless of perspective or frame of reference.
Assuming you're familiar with nothing further than pre-calculus, your mental model for graphing, & in general geometrically representing objects, defaults to the good ole' Cartesian coordinate system. With its ever-familiar X & Y axes that intersect at an origin, this system has provided the foundation for mathematically representing space up to this point. Even in early sciences like Newtonian physics, we apply linear motion equations to observe the path of a thrown ball or shot projectile within this convenient space. And staying within the confines of this theoretical, orthogonal space serves students very well as a learning tool.
However, these perfectly-symmetrical, centered Cartesian Coordinates are theoretical learning tools — they’re of little use when it comes to real objects.
When mathematicians, engineers or physicists model an object in real life, there are a handful of undeniably fixed properties that are independent of any type of coordinate system. To show this, let’s start simple, by describing the theoretical temperature (K) at a certain point (P) in a square room represented by an XY-plane. Below, we represent this same point (P) using two different origins, or coordinate systems:
Regardless of the coordinate system we use to represent our point (P), the physical expression of the object, temperature in this case, should be the same. But let's take that thought experiment further with a physical quantity that has both direction & magnitude, say, acceleration. Now let's imagine we drop some object (K) from the top of a building; when K is dropped, the acceleration vector (A) acts on it. Using the same principle as the previous example, we can also graph this scenario in multiple ways:
The position of our object (K), represented by (x,y) pairs, changes, but the acceleration vector (A) does not. The physical quantity of acceleration expresses the same meaning independent of our coordinate system.
We can extend these examples to higher dimensions & the point remains the same: mathematicians & engineers need a way to geometrically represent physical quantities & understand how they behave under different coordinate systems.
With this necessity, eventually arrived the concept of tensors, the crux of this article. What follows is not the most strictly-accurate mathematical definition, but rather an intuitive ramp-up that serves as a great starting point for the rest of this light introduction:
Tensors are mathematical objects that are invariant under a change of coordinates & have components that change in predictable ways
Tensor analysis & its follow-up, tensor calculus, revolve around these types of IRL objects. From stress to conductivity to electromagnetism, tensors are inevitable in the higher branches of STEM. Quite famously, & a personal motivator for learning tensors, Einstein's (yes, that Einstein) General Relativity equation is written exclusively in terms of tensors. Again, there's a more accurate, abstract definition that we'll mention towards the end, but this will give us a starting-off point as we walk through the basics of tensor math.
Before we move on, however, let's discuss what tensors are not. Unfortunately, most advanced math is skipped over until it's a need-to-have in the engineering toolkit; this, in turn, creates further confusion as each field introduces tensors with slightly different vernacular. Among other answers, if you Google "what are tensors" you'll likely come across the following:
— Tensors are containers of data
— Tensors are generalized matrices
Both of these have some truth to them, but neither is a correct definition. Using a mental model of storage does capture the notion that tensors store a collection of components, but it overlooks a key principle: tensor components follow specific behaviors under linear transforms. On the second definition, it's true that the rest of this walk-through will include columns, rows, & matrices; but these are merely ways of spatially organizing numbers. They are the tools of tensors, yet they fall short of capturing what tensors are.
With that out of the way, it’s time to break out some math & walk-through the very beginnings of tensors.
As you'll shortly see, an introduction to tensors typically includes reviewing objects that most are already familiar with (rows, columns & matrices); this often leads to students glazing over or speed-reading through these reviews, which is a grave error. The field of tensors assigns new meanings to these objects & also introduces a plethora of new notation, two factors that greatly contribute to confusion among newcomers.
In general, be warned: while the next few sections look like a linear algebra 101 review, they're most certainly not. We need to work our way up to tensors, & in order to do so, instead of overwhelming you by introducing all the new notation, rules & meanings at once, we'll bring them in piece-by-piece as we review familiar tools.
We can start by revisiting the simplest example of a physical quantity: a scalar. As I'm sure you've heard numerous times, a scalar is simply a number with no direction, only magnitude. But are all scalars tensors? No.
Recall that the tool, a scalar, is not the definition. Temperature & magnitude? Yes, these scalars qualify as (0,0)-tensors or rank-0 tensors (we'll formally define these notations later) since they represent the exact same meaning from all frames of reference (coordinate systems / basis vectors). But here's an example of a scalar that is not a rank-0 tensor: light frequency. Why? Because your measurement depends on your frame of reference; for example, whether you're moving toward or away from the light source changes the frequency you measure (the Doppler effect). With this deeper clarification on what qualifies as a tensor & an informal introduction to one of the common notations ("rank-x tensor"), we're good to move on to vectors.
Our first real look at a special type of tensor is an object most are familiar with: good ole' vectors. Vectors are a unique type of tensor (this will become clearer later) written strictly as column vectors of n dimensions. Additionally, again using notation that'll become much clearer later, they're also written as (1,0)-tensors or rank-1 tensors.
Let's make sure we're clear on a few concepts when it comes to vectors. First, if you're not familiar or you forgot the terminology, we're going to introduce the term basis vector. An intuitive way of thinking of basis vectors is to consider them the equivalent of the x, y, z, etc… axes. This property of the basis is exactly why we can denote vectors as scalars multiplied by the individual basis vectors as follows: v = 5i + 6j + 4k. Here, i, j & k are the basis vectors; we can switch these out, though.
Instead of using different letters (i,j,k), we can abstract this property out to any n number of dimensions & simply replace each letter with e followed by a subscript: 5e1 + 6e2 + 4e3. Simplifying a step further, we realize that for any vector of n dimensions, instead of writing each scalar & basis vector term out, we can compactly write the vector as a sum. In general, we can then represent vectors with a sum notation over all scalars multiplied by their basis vectors:
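In symbols, that sum looks like the following (one standard way to write it, using a raised index for the scalar components, a convention we'll come back to when we discuss raised & lowered indices):

$$\vec{v} \;=\; \sum_{i=1}^{n} v^i\,\vec{e}_i \qquad \text{e.g.}\quad \vec{v} = 5\vec{e}_1 + 6\vec{e}_2 + 4\vec{e}_3$$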
Moving forward, it's very likely that you'll come across similar notation, as this is one of the two standard notations for vectors in Tensor-land.
We've mentioned that a key property of tensors is that their meaning, the combination of their components, is invariant to coordinate changes, or transforms. So what exactly does this mean? Let's again start with some vector (v) laid out on the same space yet defined by two different sets of basis vectors, original (e1,e2) & alternative (~e1,~e2):
As we can see above, our original basis vectors (blue) line up to represent our comfortable, orthogonal coordinate system, while the alternative basis vectors (teal) create a new, non-orthogonal vector space. The key takeaway here is that the vector v (orange) does not change under a change of basis. Before we analyze the components of v in both spaces, let's first walk through the math of how the basis vectors transform, from (e1,e2) to (~e1,~e2).
Forward Transform
Let's ignore the vector v for a moment & just focus on the two sets of basis vectors: how can we mathematically represent our alternative (~e1,~e2) basis vectors in terms of (e1,e2)? As it turns out, this is a straightforward process; all we have to do is define each alternative basis vector as a sum of scalars times our original basis vectors (i.e. ~e1 = a*e1 + b*e2 & ~e2 = c*e1 + d*e2). Shown below, all we did to calculate our scalars (a,b,c,d) is manually grab our basis vectors (e1,e2) & scale them appropriately. Originally both 130px in length, we scaled & transformed our original basis vectors until they intersected with the alternative basis vectors & wrote the results down as fractions with a common denominator of 5 (this is all in pixels, but the units are irrelevant):
As seen on the bottom right of the image above, the result is a matrix with n columns; these columns tell us how to move forward from our original basis to an alternative basis. Appropriately, this matrix is commonly described as a forward transform; we'll denote it with a capital, bold F moving forward (another common symbol is a capital S). But what if we wanted to go in reverse, from the alternative basis back to our original?
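As a concrete sketch, here's a forward transform written out in code. The figures aren't reproduced here, so the exact numbers below are back-derived from the component values quoted later in this guide (v changing from [1/2, 1/2] to [15/22, 10/22]); treat them as an assumption consistent with the worked example rather than a reading of the figure:

```python
import numpy as np

# Original (orthonormal) basis vectors.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])

# Forward transform F: column j holds the alternative basis vector ~e_j
# expressed in the original basis (values assumed, consistent with the
# fractions-over-5 description & the components quoted later).
F = np.array([[1.0, -2/5],
              [3/5,  1/5]])

# Each alternative basis vector is a linear combination of the originals.
e1_alt = F[0, 0] * e1 + F[1, 0] * e2   # ~e1
e2_alt = F[0, 1] * e1 + F[1, 1] * e2   # ~e2
print(e1_alt, e2_alt)                  # [1.  0.6] [-0.4  0.2]
```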
Backward Transform
As the name suggests, there is a related transform, the Backward Transform (B), that does the exact opposite of the Forward Transform (F); and whenever we think inverse when working with matrices, the identity matrix (I) should come to mind. As implied, the basis-vector Forward & Backward Transforms are related as follows: BF = I. We could manually measure out our original basis vectors as sums of our alternative basis vectors, like we did above. Instead, however, let's double-check our intuition by continuing our example algebraically: we have F & I, so let's work through the steps to arrive at the Backward Transform (B):
With four variables & four equations, the algebra above is tedious, but simple. Note, however, that we again introduced new notation: instead of using random letters as our variables for the components of the backward transform, we're using the letter "B" (for "backward") with both subscripts & superscripts. This is a common trip-up when learning about tensors: the superscripts are not power symbols; they indicate a component's position within the matrix. In fact, moving forward, for the rest of this guide you can safely assume that all superscripts indicate position, not exponents. This notation will become clearer & more powerful as we move on, but for now it's worth noting that the superscripts (above "B") indicate columns while the subscripts refer to rows.
Working through the algebra, we arrive at the four components on the right that together create the Backward Transform. Beautiful. With both of these transforms we can seamlessly switch between our original & alternative basis vectors; we can summarize this relationship with the following ~fancy~ sum notations:
To transform from our original basis to our alternative basis, we multiply the original with the Forward Transform; to transform from our alternative basis to our original basis, we multiply the alternative with the Backward Transform.
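In the sum notation just described, one standard way to write that pair of relationships is (index conventions vary slightly between texts):

$$\tilde{e}_j \;=\; \sum_{i=1}^{n} F^{i}{}_{j}\,\vec{e}_i \qquad\qquad \vec{e}_j \;=\; \sum_{i=1}^{n} B^{i}{}_{j}\,\tilde{e}_i$$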
Excellent: we know how the basis vectors transform, so now let's break down how the vector components transform. How do the components of our example vector (v) differ under the two different sets of basis vectors (e1,e2 & ~e1,~e2)? How do they behave during the transform? Below, we again show our same vector in the two different spaces; this time, though, we're highlighting the vector components as sums of their respective basis vectors:
Immediately it's clear that the vector components ((1/2)*e1, (1/2)*e2) & (c*~e1, d*~e2) are different. But we already know how they transform, right? The Forward Transform worked for the basis vectors, so let's simply apply the same transform to the components of the vector in the original basis:
Something went wrong here. As we can see above, transforming the original component vectors with the Forward Transform did not return the same vector — the image on the right shows that we instead arrived at some new vector. To understand why this is, let’s carefully observe the movement of the component vectors through the transform:
Watching just the components of the vectors during our transform highlights something interesting: the components move in the opposite direction. They move "against" the basis transform, or, in more appropriate nomenclature, they're contravariant. We'll revisit this term continuously as it's a critical part of the Tensor dictionary; for now, all we've shown is that the components of a vector transform in the opposite direction to the basis. We'll quickly double-check this by multiplying our original vector components by the Backward Transform (instead of the assumed Forward Transform):
Finally, as we can see above, our original vector components transformed with the Backward Transform spat out the same vector, but now in terms of our alternative basis vectors: v = (15/22)~e1 + (10/22)~e2.
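A quick numeric sanity check of that contravariant behavior, reusing the assumed forward transform from the earlier sketch:

```python
import numpy as np

# Assumed forward transform (same values as before).
F = np.array([[1.0, -2/5],
              [3/5,  1/5]])

# The backward transform is the matrix inverse of F, so B @ F = I.
B = np.linalg.inv(F)
print(np.allclose(B @ F, np.eye(2)))   # True

# Components of v in the original basis transform with B, not F.
v = np.array([0.5, 0.5])
v_alt = B @ v
print(v_alt)                           # ~[0.6818 0.4545] = [15/22, 10/22]
```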
We've already covered a decent amount of new vocabulary & notation; we'll summarize it below before we move on to our next example of a special tensor: the covector.
Note*: From Wiki to Wolfram, the formal, rigorous definition for each of the concepts defined below & in the following reviews is a Google search away; for accessibility, I'm introducing these terms in a much more informal, beginner-friendly manner, but I still highly encourage looking them up.
Basis Vector: the independent x,y,z…n axes for a given vector space. These basis vectors, usually denoted by the letter e, are what allow us to represent vectors as a sum of scalars multiplied by basis vectors.
Vectors: a special type of tensor represented strictly as columns & written as an aggregate sum of scalars multiplied by n basis vectors. Also known as contravariant vectors & (1,0)-tensors.
Sum Notation: a compact way to write vectors; each vector is represented as an aggregate sum of scalar components multiplied by the n basis vectors.
Forward & Backward Transforms: the matrices that take us from an old basis to a new basis (or vice versa), typically denoted F/S (Forward) & B/T (Backward). The product of the Forward & Backward Transforms is the identity matrix I: BF = I.
Contravariant Transformation: how the components of a vector transform under a change in basis; easy to remember by the name, they change contra/against the forward transform, as reflected in the formula below:
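In sum notation, the contravariant rule we just verified reads (again, one standard way to write it):

$$\tilde{v}^i \;=\; \sum_{j=1}^{n} B^{i}{}_{j}\,v^j$$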
Alright, admittedly the previous section was mainly a review with a dash of new topics. Unfortunately, the learning curve steepens in this next section as it introduces an entirely new special class of tensors with rather ambiguous terminology. For example, apart from "covector," the objects we'll cover here are also known as: (0,1)-tensors, 1-forms, covariant vectors, dual vectors, linear functionals & linear functions. To ease this learning curve, we'll again start with application & examples, & eventually trend towards an abstract, but accurate, understanding.
Reasonably, the lowest-hanging branch of conceptualization when it comes to covectors is the tool of choice: rows. In Tensor-land, vectors are strictly written as columns & covectors are strictly written as rows.
If you have some linear algebra background, it's worth giving the disclaimer: freely transposing columns into rows is not allowed here. Why? Because in real life basis vectors are almost never orthonormal; a column vector & its transposed row convey the same meaning & return the same vector only in the special case of an orthonormal basis.
Fine, so we know how covectors are written, but what exactly are they? Let's first dissect the technically correct, albeit abstract, definition of a covector/covariant vector/dual vector/1-form/(0,1)-tensor:
Given a vector space 𝑉, there is a “dual” space 𝑉∗ which consists of linear functions 𝑉→𝔽 (where 𝔽 is the underlying field) known as dual vectors. Given 𝑣∈𝑉,𝜙∈𝑉∗, we can plug in to get a number 𝜙(𝑣).
I struggled tremendously with this definition because it packs a lot of abstraction into very little space, so let's go through it piece-by-piece. Ignore the first clause (we'll circle back) & go to "consists of linear functions 𝑉→𝔽 (where 𝔽 is the underlying field) known as dual vectors." A really simple translation here is: a dual/co/covariant vector/1-form is a linear function that "eats" a vector (v) as an input & returns some scalar; the notation for this usually looks something like 𝜙(v) = c. Returning to the first clause, these linear functions do not belong to our vector space V but to a separate space known & written as the "dual space" V* (the scalars they return live in the underlying field 𝔽). Hopefully that reduces some confusion on what covectors are:
Linear functions represented by rows in an array that input normal vectors & output scalars.
These covectors do not live in our vector space V but rather in that other space, the dual space V*. We visualize normal (or contravariant) vectors of n dimensions as arrows with direction & magnitude in our vector space; so how do we visualize covectors?
We can gain a visual understanding by drawing out an arbitrary covector (𝜙 = [2 1]) acting on our example vector space, not on any specific vector just yet:
All we're doing above is exploring what an example covector (𝜙 = [2 1]) would look like as it spits out varying scalars. This is why referring to "covectors" as linear functions makes sense: they're best visualized as a series of parallel lines (in 2D), planes (in 3D) or hyperplanes (in higher dimensions). Instead of inserting any one specific vector into our covector (𝜙 = [2 1]), we drew out the general covector. To see how this specific covector (𝜙) interacts with our orange example vector (v), let's now lay the vector over our covector:
Our diagram now represents our covector (𝜙) acting on our vector (v): 𝜙(v). Highlighted in yellow, the resultant scalar from 𝜙(v) is the number of lines crossed; in this specific example, we can count that our vector (v) crosses approximately 1.5 lines, so we can say that 𝜙(v) = 1.5. To further our visual understanding, we can double-check this algebraically:
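With v = (1/2, 1/2) in the original basis, the algebra is a one-liner:

$$\phi(\vec{v}) \;=\; \begin{bmatrix} 2 & 1 \end{bmatrix}\begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix} \;=\; 2\cdot\tfrac{1}{2} + 1\cdot\tfrac{1}{2} \;=\; 1.5$$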
We’ve now proven both algebraically & visually that 𝜙(v) = 1.5. With this experience under our belts, let’s re-visit the technical definition for a covector:
Given a vector space 𝑉, there is a “dual” space 𝑉∗ which consists of linear functions 𝑉→𝔽 (where 𝔽 is the underlying field) known as dual vectors. Given 𝑣∈𝑉,𝜙∈𝑉∗, we can plug in to get a number 𝜙(𝑣).
Let’s now switch our focus over to the “dual space V*” part of the definition. When it comes to vectors in a vector space V, we can write them out as sums of scalars & basis vectors: v= ae1 + be2 + ce3…so what’s the equivalent of writing out our basis covectors in the dual space V*? How do we derive them & what do they look like? Once we know this, we can finally explore how covector components behave under a change in basis.
Just like we have two basis vectors (e1,e2) & two alternative basis vectors (~e1,~e2), we can also write our covector as an aggregate sum of scalars & basis covectors (also commonly known as the dual basis); instead of e & ~e, we'll denote our dual basis components with epsilon (ϵ1,ϵ2) variables. First, as a quick prerequisite & sneak peek, however, we'll need to learn about the famous Kronecker Delta.
Kronecker Delta
The Kronecker Delta is a special, compact function that tells us how vector & covector basis interact over the same index (it’s okay if this terminology is still a bit unclear):
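In symbols, the Kronecker Delta condition reads:

$$\epsilon^{i}(\vec{e}_j) \;=\; \delta^{i}_{j} \;=\; \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$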
The above tells us that whenever we take the dot product of a basis vector & a basis covector, if they share the same index the result equals 1; otherwise, it equals 0. It's critical to note that up to this point we've strictly used subscripts (lowered indices) on our e & ~e basis vectors; this is because in Tensor-land:
Vectors/contravariant vectors are written with lowered indices, while covectors/covariant vectors are written with superscripts, or raised indices.
In the formula above, the superscript i refers to the index of the basis covector while the subscript j refers to the index of the basis vector. We can immediately apply the Kronecker Delta to help us derive the basis covectors (ϵ1,ϵ2). Take a moment to understand exactly what the KD implies: every basis vector & basis covector pair with the same index evaluates to one, which means each basis vector crosses exactly one line of its same-index covector (though the two aren't always strictly perpendicular), like in the breakdown of our original basis below:
The diagram above breaks down our covector into its respective dual basis; following the Kronecker Delta, we can see that each basis vector/covector pair is perpendicular & that the pieces combine to represent our dual basis: B = [1 1]ϵ, or ϵ1 + ϵ2.
Great. But what about representing our specific example covector 𝜙? Our previous way of writing it as a row [2 1]ϵ still works; or, we can write it in a similar sum notation that we used for vectors — multiplying the dual basis by their appropriate scalars: 𝜙 = [2 1]ϵ =2ϵ1 + 1ϵ2. If we draw out both scalar/covector multiples, we can see that we indeed arrive at the graph of lines from a few paragraphs above:
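To make the Kronecker Delta condition concrete, here's a small sketch of how a dual basis can be derived numerically: stack the basis vectors as the columns of a matrix E, & the rows of E⁻¹ are the dual basis covectors, since E⁻¹E = I is exactly the statement ϵ^i(e_j) = δ^i_j. (The alternative-basis values are the same assumed ones used in the earlier sketches.)

```python
import numpy as np

# Alternative basis vectors stacked as the columns of E (assumed values).
E = np.array([[1.0, -2/5],
              [3/5,  1/5]])

# The rows of E's inverse are the dual-basis covectors: entry (i, j) of
# E_inv @ E is row i of E_inv applied to column j of E, i.e. eps^i(e_j).
E_inv = np.linalg.inv(E)
eps1_alt, eps2_alt = E_inv[0], E_inv[1]

# E_inv @ E = I is exactly the Kronecker Delta condition.
print(np.allclose(E_inv @ E, np.eye(2)))   # True
print(eps1_alt, eps2_alt)                  # ~[0.4545 0.9091] ~[-1.3636 2.2727]
```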
Now with a basic grasp on covectors, their output, basis covectors & the dual space, we can finally turn our attention to the real topic du jour — how do covectors & their components behave under a transform?
In the previous section, we discovered that while basis vectors transform one way (with the Forward Transform), the underlying vector components transformed in a contravariant manner (with the Backward Transform); predictably, as the title & etymology imply, covectors transform in the opposite way:
Covector components transform with (or covariantly to) the change in basis — they follow the Forward Transform; the basis covectors, or dual basis, follow the Backward Transform.
We can best show this by continuing our example. We'll first derive our alternate dual basis (~ϵ) using the Backward Transform; then, we'll express our covector 𝜙 in terms of our alternate dual basis. To double-check that everything worked out, we'll algebraically verify that 𝜙~ϵ (v~e) = 1.5; in other words, we'll confirm that our covector acting on our vector returns the same value regardless of the basis it's expressed in.
The components of a covector transform in the same direction as the change of basis vectors (& opposite to its own dual basis); for us, that means if we want to express our covector 𝜙 in terms of ~ϵ instead of ϵ, we need to take the dot product of 𝜙 & our Forward Transform, as done below:
The output on the right expresses our covector 𝜙 in terms of our new dual basis; from the section on vectors above, we also already derived our example vector v in the new basis (as a reminder, our vector components changed from [.5,.5]e to [15/22,10/22]~e). To wrap this section up, we can algebraically verify the basis-independent nature of tensors by once again passing our example vector v through our covector 𝜙, except with both tensors expressed in our new/updated basis. Recall that above we calculated that 𝜙(v) = 1.5:
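A quick numeric check of that invariance, once again using the assumed F (& its inverse B) from the earlier sketches:

```python
import numpy as np

F = np.array([[1.0, -2/5],
              [3/5,  1/5]])   # assumed forward transform
B = np.linalg.inv(F)          # backward transform

v   = np.array([0.5, 0.5])    # vector components, original basis
phi = np.array([2.0, 1.0])    # covector components, original dual basis

v_alt   = B @ v               # contravariant: vector components use B
phi_alt = phi @ F             # covariant: covector components use F

print(phi @ v)                # 1.5
print(phi_alt @ v_alt)        # 1.5 (up to float rounding): same number
```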
It worked! Our covector acting on our vector returned the exact same value regardless of whether the components were expressed in our original (blue arrows) or updated (green arrows) basis. This is a huge leap in our understanding of tensors because it highlights their invariant nature: while the components of both our vector & covector changed, the whole, their combined geometric & algebraic meaning, was preserved.
Covectors: a special type of tensor represented strictly as rows & written as an aggregate sum of scalars multiplied by n dual basis covectors. Also known as covariant vectors, (0,1)-tensors, 1-forms, dual vectors, linear functionals & linear functions:
Dual Basis Vector: the independent x,y,z…n axes for a given covector/dual space V*. These dual basis covectors, usually denoted by the symbol ϵ, are what allow us to represent covectors as a sum of scalars multiplied by basis covectors; they're usually derived algebraically with the Kronecker Delta. They follow the Backward Transform when updating to new basis vectors:
Kronecker Delta: a key formula that describes how contravariant & covariant components interact; often used to derive the dual basis given a set of basis vectors. Vectors/contravariant vectors are written with lowered indices, while covectors/covariant vectors are written with superscripts, or raised indices. Adherence to this notation is critical for further concepts.
Covariant Transformations: how the components of a covector transform under a change in basis; easy to remember by the name, they change co/with the forward transform, as reflected in the formula below:
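In the same sum notation as before, both covariant rules can be written as (one standard form):

$$\tilde{\epsilon}^{i} \;=\; \sum_{j=1}^{n} B^{i}{}_{j}\,\epsilon^{j} \qquad\qquad \tilde{\phi}_{j} \;=\; \sum_{i=1}^{n} \phi_{i}\,F^{i}{}_{j}$$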
In addition to a simple (0,0)-tensor, we've now reviewed two special types of tensors: vectors/(1,0)-tensors & covectors/(0,1)-tensors. As the metaphorical bow on top, we'll wrap up this light intro by introducing a final type of tensor that connects to the previous ones.
As the title suggests, this final type of tensor is known as a linear map. Much like our previous tensors, it also has an array of names such as linear transforms or (1,1)-tensors. Next, similar to the previous sections, we’ll introduce the tool of choice for representing & manipulating linear maps: matrices. Vectors are written as columns, covectors are written as rows, & linear maps are written as matrices.
Next, onto the output of a linear map; much like a covector takes in an input vector (v) & returns a scalar, a linear map takes in an input vector (v) & returns a new vector (w). Linear maps transform input vectors but they do not transform a basis — they simply map one vector to another in the same vector space.
Moving on from both our representation & output, let’s dive into the abstract definition likely found in a textbook:
A function 𝐿:𝑉→𝑉 is a linear map if, for any two vectors 𝐮,𝐯 & any scalar c, the following two conditions (linearity) are satisfied:
𝐿(𝐮+𝐯) = 𝐿(𝐮) + 𝐿(𝐯)
𝐿(𝑐𝐮) = 𝑐𝐿(𝐮)
Linear maps transform vectors, but they do not transform the basis; another, more abstract way of thinking about linear maps is to consider them as spatial transforms. Explanations, no matter how formal or informal, have limits in conveying meaning, so let's go ahead & jump into our very last example.
Let’s imagine a linear map (L)e = {(1,-2),(5,-3)}. Below, we’ll apply this linear map (L) to our existing vector (v) to return some new vector (w):
In the images above, starting from the left, we first see our original vector v (.5,.5); in the middle section, we start with the linear map (L), which acts on our vector v to output a second vector w; in the final image to the right, we see the new vector w along with its components. Take a moment to walk through the algebra above; below, we expand L(v) by breaking it up into terms of our basis vectors e1 & e2:
Defining our linear map in terms of our basis vectors shows us, once again, how this operation can be expressed with a series of sums.
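Here's that series of sums as a short sketch in code, reading the braces in (L)e = {(1,-2),(5,-3)} as the rows of the matrix (an assumption, since the figure isn't reproduced here):

```python
import numpy as np

L = np.array([[1.0, -2.0],
              [5.0, -3.0]])   # assumed row-wise reading of {(1,-2),(5,-3)}
v = np.array([0.5, 0.5])      # our example vector in the e basis

# w^i = sum over j of L[i][j] * v^j -- the series of sums, written by numpy.
w = L @ v
print(w)                      # [-0.5  1. ]
```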
Before distilling further, it's now worth clarifying a point made all the way back in the beginning: rows, columns, & matrices are tools used to represent tensors; they themselves are not tensors. For example, we already used matrices earlier in this guide to represent the Forward & Backward Transforms. Those were nothing but square arrays of real numbers whose definition depended on a specific choice of bases; they were not tensors, yet they were matrices. Linear maps, on the other hand, are tensors represented by matrices. As we'll see below, yes, the inner components of a linear map will change according to a new basis, but its geometric meaning will not, because, once again, tensors are invariant: they preserve their meaning.
Naturally, similar to the previous sections, we're interested in exploring the invariance of these objects, aka how they behave under a basis transform. We already know what our vector (v) & L(v), or (w), look like in our original e basis; now we're going to define both L & (w) in terms of our new basis ~e.
In the very first section on vectors/contravariant vectors, we worked through a change in basis (shown by the green-teal arrows with ~e) for our vector: v = (15/22)~e1 + (10/22)~e2. By now it's hopefully obvious that, as with all previous tensors, we need to update the internal components whenever we have a change in basis; this includes our new L(v).
We want our linear map (L) to reflect the exact same transform, v to w, regardless of our change in basis; the matrix of numbers we used for (L), which were in our original e basis {(1,-2),(5,-3)}, are no longer accurate in our new basis ~e. So the question is: how do the numbers in our linear map change in the new ~e basis? Walking through it all in sum notation, we'll go through the algebra below:
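The end result of that algebra (stated here in its standard form so the formula is on the page even without the figure) is:

$$\tilde{L}^{i}{}_{j} \;=\; \sum_{k=1}^{n}\sum_{l=1}^{n} B^{i}{}_{k}\,L^{k}{}_{l}\,F^{l}{}_{j} \qquad\Longleftrightarrow\qquad \tilde{L} = B\,L\,F$$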
We’ve now figured out how to update the components in our linear map to accommodate our change in basis! As you can see above, updating our linear map from an old basis to a new basis uses both a Forward & a Backward Transform; intuitively, this should make sense since a linear map, or a (1,1)-tensor has both contravariant & covariant components. Below, we’ll quickly double check our derived formula by implementing it with the example linear map from above. We’re specifically going to calculate:
~L — The components of our linear map in our new basis
~L(v~e) — The output of our linear map on our vector
We'll have both a matrix that represents our linear map L in our new basis & the components of our output vector w in the new basis. If this all works out, then our vector w will look exactly as it does in the example above. Please refer to the former diagrams to find F, B & L:
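Here's that double-check as a short script, again using the assumed F (& the row-wise reading of L) from the earlier sketches:

```python
import numpy as np

F = np.array([[1.0, -2/5],
              [3/5,  1/5]])             # assumed forward transform
B = np.linalg.inv(F)                    # backward transform
L = np.array([[1.0, -2.0],
              [5.0, -3.0]])             # linear map in the original e basis
v = np.array([0.5, 0.5])                # vector components in the e basis

L_alt = B @ L @ F                       # (1,1)-tensor rule: one B, one F
v_alt = B @ v                           # vector components in the ~e basis

w_alt = L_alt @ v_alt                   # output vector, ~e components
print(w_alt)                            # ~[0.6818 2.9545] = [165/242, 715/242]
print(np.allclose(w_alt, B @ (L @ v)))  # True: same w, just re-expressed
```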
And we’re done! We’ve now updated our linear map according to the change in basis. If you look at the final column on the bottom right, we indeed got our transformed vector w written in terms of our updated basis vector (~e): w = (165/242)~e1 + (715/242)~e2.
To conclude this very last section, we’ll overlay the above on our original basis — highlighting, once again, the invariant property of tensors, in this case, a (1,1)-tensor, or a linear map:
Linear Maps: a special type of tensor represented strictly as matrices & written as an aggregate sum of scalar components multiplied by n basis vectors. Also known as linear transforms or (1,1)-tensors since they have both covariant & contravariant components:
Einstein Notation: we briefly saw this when going through the algebra to determine how to transform our linear map. Basically, Einstein (yes, the Einstein) realized that as long as we're careful about our super & subscript indices, we can drop the sigma sum notation & keep the same meaning (see the one-liner after this summary). If/when we continue onward, this will be the standard notation for tensors: a series of aggregate sums is simply assumed.
Linear Map Transformations: how the components of a linear map transform under a change in basis; absolutely critical to note that we use both the Forward & Backward Transform since linear maps, (1,1)-tensors, have both covariant & contravariant components:
*For the sake of brevity I've excluded the derivation for transforming a linear map in the new basis back to our old basis; as a slight hint for a self-directed exercise, observe that FB = I, i.e. the product of the Forward & Backward Transforms, written in index notation, is the Kronecker Delta.
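As promised above, here's the Einstein-notation version of the linear map rule; the repeated indices k & l are summed over automatically, no sigmas required:

$$\tilde{L}^{i}{}_{j} \;=\; B^{i}{}_{k}\,L^{k}{}_{l}\,F^{l}{}_{j}$$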
We've now reviewed two unique types of tensors, vectors & covectors, that act as building blocks; combining them allowed us to introduce a third type of tensor, the linear map. Throughout, we've continuously reminded ourselves that the power behind tensors is to accurately represent objects in real life, regardless of our spatial perspective, basis or transform.
Contravariant & covariant transformations, aka whether components transform against or with a change of basis, are really the driving mechanics behind tensors; the covariant components are the perpendicular projections onto the basis vectors, while the contravariant components are the parallel projections.
Contravariant & covariant components are denoted by a superscript & a subscript respectively. We also learned the special Kronecker Delta formula, which gave us an understanding of how co- & contravariant objects interact. Additionally, we picked up an entirely new system of notation: first representing vectors & covectors with a sum notation, then introducing the sigma-less Einstein notation.
In short, this guide was certainly not concise, but it was all a prerequisite to finally introduce a much more abstract, yet accurate, general definition of a tensor. Vectors, covectors, & linear maps are all special cases of tensors; the best definition for a tensor is:
An Object That Is Invariant Under A Change Of Coordinates & Has Components That Change In A Special, Predictable Way
We've already seen a sneak peek of it, but of course the follow-up question is: in exactly what way do tensor components transform? The following formula looks incredibly intimidating, which is why we've left it until now, but trust that we've learned everything needed to fully understand it:
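Since the figure isn't reproduced here, this is the standard form of that general transformation law for an (m,n)-tensor: one B for every contravariant index, one F for every covariant index, with repeated indices summed:

$$\tilde{T}^{\,i_1 \cdots i_m}{}_{j_1 \cdots j_n} \;=\; B^{i_1}{}_{a_1}\cdots B^{i_m}{}_{a_m}\;F^{b_1}{}_{j_1}\cdots F^{b_n}{}_{j_n}\;T^{\,a_1 \cdots a_m}{}_{b_1 \cdots b_n}$$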
We know that the F & B stand for the Forward & Backward Transforms; the T, appropriately, stands for Tensor. Let's isolate that T momentarily & tie its notation back to the (m,n)-tensor standard we've been hinting at since the beginning. Below, we finally come full circle:
Now, let's look back at the general tensor formula. A tensor is an object with m contravariant & n covariant components; to figure out how any (m,n)-tensor changes under a basis transform, all we have to do is follow the general rule set out above. All the indices stored upstairs are the contravariant ones; all the indices stored downstairs are the covariant ones. Just based on how many (m,n) components any random tensor contains, we can immediately predict how it'll change under a basis transform.
Looking back at what we presented in the preceding sections, we can now (hopefully) fully understand why a contravariant vector is in fact a (1,0)-tensor, a covariant vector is a (0,1)-tensor & a linear map is a (1,1)-tensor. At any point P in a generalized system, with an associated set of local axes & coordinate surfaces, we can specify two related but distinct sets of unit vectors: (1.) a set tangent to the local axes, & (2.) another set perpendicular to the local coordinate surfaces. Known as the basis vectors & the dual basis, with just this basic information we can fully predict how a geometric object will behave under a spatial transform.
Tensors, tensor analysis & tensor calculus take account of coordinate independence & of the peculiarities of different kinds of spaces in one grand sweep. The formalism is structurally the same regardless of the space involved, the number of dimensions, the (m,n) rank, & so on. For this reason, tensors are very effective tools in the hands of theorists working in advanced studies; a power tool indeed.
Before we completely close out, I’d like to circle back to the very opening of this guide: the motivation. Why bother learning about tensors in the first place?
Well, you'll eventually run into applied tensors in advanced branches of physics & engineering; my personal motivation, however, came from the fact that Einstein's eminent formula, work, & understanding of our universe is written in tensors. Yes, his general relativity field equation in particular is written in tensors:
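For reference, one common form of the field equations (reproduced here since the image isn't):

$$R_{\mu\nu} \;-\; \tfrac{1}{2}R\,g_{\mu\nu} \;+\; \Lambda\,g_{\mu\nu} \;=\; \frac{8\pi G}{c^{4}}\,T_{\mu\nu}$$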
Every variable with two subscripts above is in fact a tensor. With this guide you're now much better equipped to Google & work through each term. Take particular note of the gμν term above: just like the linear map is a combination of covariant & contravariant components, this g is also a special, yet common, type of tensor known as the metric tensor. In fact, I purposely left out the metric tensor because it's so important that it merits its own follow-up; as a sneak preview, the metric tensor is the tool that helps us define distance in any vector space, another incredibly useful tool, especially when modeling gravity.
If you've stuck with this all the way through, I want to end on a final note of gratitude: thank you. This piece took more out of me than I signed up for; I hope you find it useful, or at least interesting. I'm now looking forward to writing at least three follow-ups: on the metric tensor, on the various tensor products, & finally, on general relativity.