An Introduction to Tensors

As a physics student, I’ve come across mathematical objects called tensors in several different contexts. Perhaps confusingly, I’ve also been given both the mathematician’s and the physicist’s definitions, which I believe are slightly different.

I currently think of them in the following ways, but have a tough time reconciling the different views:

  • An extension/abstraction of scalars, vectors, and matrices in mathematics.
  • A multi-dimensional array of elements.
  • A mapping between vector spaces that represents a co-ordinate independent transformation.

In fact, I’m not even sure how correct these three definitions are. Is there a particularly relevant (rigorous, even) definition of tensors and their uses, that might be suitable for a mathematical physicist?

Direct answers/explanations, as well as links to good introductory articles, would be much appreciated.


At least to me, it is helpful to think in terms of bases.
(I’ll only be talking about tensor products of finite-dimensional vector
spaces here.)
This makes the universal mapping property that Zach Conn talks about
a bit less abstract (in fact, almost trivial).

First recall that if $L: V \to U$ is a linear map, then $L$ is completely determined
by what it does to a basis $\{ e_i \}$ for $V$:
$$L(x)=L\left( \sum_i x_i e_i \right) = \sum_i x_i L(e_i).$$
(The coefficients of $L(e_i)$ in a basis for $U$ give the $i$th column
in the matrix for $L$ with respect to the given bases.)
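As a concrete sketch of this (not part of the original answer; the numbers are arbitrary), here is a small NumPy check that a linear map is pinned down by the images of the basis vectors, which form the columns of its matrix:

```python
import numpy as np

# A linear map L: R^3 -> R^2 is determined by the images of the
# standard basis vectors e_1, e_2, e_3; these images are the columns
# of the matrix of L.
L_e = [np.array([1.0, 0.0]),   # L(e_1)  (arbitrary example values)
       np.array([0.0, 2.0]),   # L(e_2)
       np.array([3.0, 1.0])]   # L(e_3)
L = np.column_stack(L_e)       # 2x3 matrix of L

x = np.array([2.0, -1.0, 4.0])  # x = 2 e_1 - e_2 + 4 e_3

# L(x) = sum_i x_i L(e_i), computed directly from the basis images:
Lx_from_basis = sum(x[i] * L_e[i] for i in range(3))

assert np.allclose(L @ x, Lx_from_basis)
```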

Tensors come into the picture when one studies multilinear maps.
If $B: V \times W \to U$ is a bilinear map, then $B$
is completely determined by the values $B(e_i,f_j)$ where
$\{ e_i \}$ is a basis for $V$
and $\{ f_j \}$ is a basis for $W$:
$$B(x,y) = B\left( \sum_i x_i e_i,\sum_j y_j f_j \right) = \sum_i \sum_j x_i y_j B(e_i,f_j).$$
For simplicity, consider the particular case when $U=\mathbf{R}$;
then the values $B(e_i,f_j)$
make up a set of $N=mn$ real numbers (where $m$ and $n$ are the
dimensions of $V$ and $W$), and these numbers are all that we need to keep
track of in order to know everything about the bilinear map $B:V \times W \to \mathbf{R}$.
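To make the bookkeeping explicit (a NumPy sketch with made-up values, not from the original answer): storing the $mn$ numbers $B(e_i,f_j)$ as an $m\times n$ array is exactly enough to evaluate $B$ on any pair of vectors.

```python
import numpy as np

m, n = 3, 2
rng = np.random.default_rng(0)
Bmat = rng.standard_normal((m, n))   # Bmat[i, j] = B(e_i, f_j)

def B(x, y):
    # B(x, y) = sum_i sum_j x_i y_j B(e_i, f_j)
    return sum(x[i] * y[j] * Bmat[i, j]
               for i in range(m) for j in range(n))

x = rng.standard_normal(m)
y = rng.standard_normal(n)

# The double sum agrees with the matrix expression x^T Bmat y.
assert np.isclose(B(x, y), x @ Bmat @ y)
```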

Notice that in order to compute $B(x,y)$ we don’t really need to know the
individual vectors $x$ and $y$, but rather the $N=mn$ numbers $\{ x_i y_j \}$.
Another pair of vectors $v$ and $w$ with $v_i w_j = x_i y_j$ for all $i$ and $j$
will satisfy $B(v,w)=B(x,y)$.
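A quick illustration of this point (my own example, not from the answer): rescaling $x$ by a factor and $y$ by its reciprocal changes the pair but leaves all the products $x_i y_j$, and hence every value $B(x,y)$, unchanged.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0, 4.0])

# A different pair (v, w) with the same products v_i w_j = x_i y_j:
v, w = 2.0 * x, 0.5 * y

assert np.allclose(np.outer(v, w), np.outer(x, y))
# Consequently B(v, w) == B(x, y) for every bilinear map B,
# even though (v, w) != (x, y).
```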

This leads to the idea of splitting the computation of $B(x,y)$ into two stages.
Take an $N$-dimensional vector space $T$ (they’re all isomorphic so it doesn’t matter
which one we take) with a basis $(g_1,\dots,g_N)$.
Given $x=\sum x_i e_i$ and $y=\sum y_j f_j$,
first form the vector in $T$
whose coordinates with respect to the basis $\{ g_k \}$ are given by the column vector
$$(x_1 y_1,\dots,x_1 y_n,x_2 y_1,\dots,x_2 y_n,\dots,x_m y_1,\dots,x_m y_n)^T.$$
Then run this vector through the linear map $\tilde{B}:T\to\mathbf{R}$ whose matrix
is the row vector
$$(B_{11},\dots,B_{1n},B_{21},\dots,B_{2n},\dots,B_{m1},\dots,B_{mn}),$$
where $B_{ij}=B(e_i,f_j)$.
This gives, by construction, $\sum\sum B_{ij} x_i y_j=B(x,y)$.
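The two-stage computation can be sketched in NumPy (my own illustration, with arbitrary values): stage one forms the $N=mn$ coordinates of $x\otimes y$ as a flattened outer product, and stage two applies $\tilde{B}$, the row vector of the numbers $B_{ij}$.

```python
import numpy as np

m, n = 3, 2
rng = np.random.default_rng(1)
Bmat = rng.standard_normal((m, n))   # Bmat[i, j] = B(e_i, f_j)
x = rng.standard_normal(m)
y = rng.standard_normal(n)

# Stage 1: the N = mn coordinates of x (x) y in T,
# ordered (x_1 y_1, ..., x_1 y_n, x_2 y_1, ..., x_m y_n).
t = np.outer(x, y).ravel()

# Stage 2: apply B~, whose matrix is the row vector of the B_ij
# (flattened in the same order).
B_tilde = Bmat.ravel()

# By construction, B~(x (x) y) = sum_ij B_ij x_i y_j = B(x, y).
assert np.isclose(B_tilde @ t, x @ Bmat @ y)
```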

We’ll call the space $T$ the tensor product of the vector spaces $V$ and $W$
and denote it by $T=V \otimes W$;
it is “uniquely defined up to isomorphism”,
and its elements are called tensors.
The vector in $T$ that we formed from $x\in V$ and $y\in W$ in the first stage above
will be denoted $x \otimes y$;
it’s a “bilinear mixture” of $x$ and $y$ which doesn’t allow us to
reconstruct $x$ and $y$ individually,
but still contains exactly all the information needed
in order to compute $B(x,y)$ for any bilinear map $B$;
we have $B(x,y)=\tilde{B}(x \otimes y)$.
This is the “universal property”; any bilinear map $B$ from $V \times W$
can be computed by taking a “detour” through $T$, and this detour
is unique, since the map $\tilde{B}$ is constructed uniquely from
the values $B(e_i,f_j)$.

To tidy this up, one would like to make sure that the definition is
basis-independent. One way is to check that everything transforms
properly under changes of bases. Another way is to do the construction
by forming a much bigger space and taking a quotient with respect to
suitable relations (without ever mentioning bases).
Then, by untangling definitions, one can for
example show that a bilinear map $B:V \times W \to \mathbf{R}$ can be
canonically identified with an element of the space $V^* \otimes W^*$,
and dually an element of $V \otimes W$ can be identified with a
bilinear map $V^* \times W^* \to \mathbf{R}$.
Yet other authors find this a convenient starting point, so that they
instead define $V \otimes W$ to be the space of bilinear maps $V^*
\times W^* \to \mathbf{R}$.
So it’s no wonder that one can become a little confused when trying
to compare different definitions…

Source: Link, Question Author: Noldorin, Answer Author: Hans Lundmark