# What kind of matrices are non-diagonalizable?

I’m trying to build an intuitive geometric picture of diagonalization.
Let me show what I’ve got so far.

An eigenvector of a linear operator marks a direction along which the operator acts as a pure stretch; in other words, the operator preserves the direction of its eigenvector. The corresponding eigenvalue tells us by how much the operator stretches the eigenvector (a negative stretch means flipping to the opposite direction).
When we limit ourselves to real vector spaces, it’s intuitively clear that rotations of the plane (other than by multiples of 180°) don’t preserve the direction of any non-zero vector. Actually, I’m thinking about 2D and 3D spaces as I write, so I talk about “rotations”; for n-dimensional spaces it would be better to talk about “operators which act like rotations on some 2D subspace”.

But there are non-diagonalizable matrices that aren’t rotations: all non-zero nilpotent matrices. My intuitive view of nilpotent matrices is that they “gradually collapse all dimensions” or “gradually lose all the information” (if we apply them over and over again), so it’s clear to me why they can’t be diagonalizable.

But, again, there are non-diagonalizable matrices that are neither rotations nor nilpotent, for example:

$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

So, what’s the deal with them? Is there any kind of intuitive geometric reasoning that would help me grasp why there are matrices like this one? What’s their characteristic that stops them from being diagonalizable?

I think a very useful notion here is the idea of a “generalized eigenvector”.

An eigenvector of a matrix $A$ is a non-zero vector $v$ with associated value $\lambda$ such that
$$(A-\lambda I)v=0$$
A generalized eigenvector, on the other hand, is a non-zero vector $w$ with the same associated value such that
$$(A-\lambda I)^kw=0$$
for some positive integer $k$. That is, $(A-\lambda I)$ is nilpotent on $w$. Or, in other words, if $k$ is the smallest such integer, then
$$(A-\lambda I)^{k-1}w=v$$
for some eigenvector $v$ with the same associated value.

Now, let’s see how this definition helps us with a non-diagonalizable matrix such as
$$A = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$$
For this matrix, $\lambda=2$ is the only eigenvalue, and $v=\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is an associated eigenvector (unique up to scaling), which I will let you verify. $w=\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ is our generalized eigenvector. Notice that
$$(A-2I) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$
is a nilpotent matrix of order $2$. Note that $(A-2I)v=0$ and $(A-2I)w=v$, so that $(A-2I)^2w=0$. But what does this mean for what the matrix $A$ does? The behavior of $v$ is fairly obvious ($Av=2v$), but with $w$ we have
$$Aw = \begin{pmatrix} 1 \\ 2 \end{pmatrix} = 2w + v$$
So $w$ behaves kind of like an eigenvector, but not really. In general, a generalized eigenvector, when acted upon by $A$, gives another vector in the generalized eigenspace.
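
If you want to check the relations above without doing the arithmetic by hand, here is a minimal sketch in plain Python (no libraries). The helper name `mat_vec` and the variable names are just my own choices for illustration.

```python
# Check the 2x2 example: A = [[2, 1], [0, 2]], eigenvalue 2,
# eigenvector v = (1, 0), generalized eigenvector w = (0, 1).

def mat_vec(m, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

A = [[2, 1],
     [0, 2]]
lam = 2

# N = A - lam*I is the nilpotent part of A.
N = [[A[i][j] - (lam if i == j else 0) for j in range(2)] for i in range(2)]

v = [1, 0]
w = [0, 1]

Nv = mat_vec(N, v)     # (A - 2I) v
Nw = mat_vec(N, w)     # (A - 2I) w
NNw = mat_vec(N, Nw)   # (A - 2I)^2 w
Aw = mat_vec(A, w)     # A w
two_w_plus_v = [lam * wi + vi for wi, vi in zip(w, v)]

print(Nv)                # [0, 0]
print(Nw)                # [1, 0], which is v
print(NNw)               # [0, 0]
print(Aw, two_w_plus_v)  # [1, 2] [1, 2]
```

So $w$ is not sent to a multiple of itself: it picks up an extra copy of $v$ each time $A$ is applied.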

An important related notion is the Jordan normal form. That is, while we can’t always diagonalize a matrix by finding a basis of eigenvectors, we can always (working over the complex numbers, so that all eigenvalues are available) put the matrix into Jordan normal form by finding a basis of generalized eigenvectors.
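
As a rough illustration of that change of basis, here is a small plain-Python sketch. The matrix `B` is a hypothetical example of mine (it is not in Jordan form to begin with, unlike $A$ above), and the helpers `mat_mul` and `inverse_2x2` are illustrative names, not part of any library. Using an eigenvector and a generalized eigenvector as the columns of $P$, the product $P^{-1}BP$ should come out as a Jordan block.

```python
from fractions import Fraction

def mat_mul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix using the adjugate formula."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det],
            [-c / det, a / det]]

# Hypothetical example: characteristic polynomial (x - 2)^2,
# but only one independent eigenvector.
B = [[3, 1],
     [-1, 1]]

v = [1, -1]  # eigenvector: (B - 2I) v = 0
w = [1, 0]   # generalized eigenvector: (B - 2I) w = v

# Columns of P are v and w.
P = [[v[0], w[0]],
     [v[1], w[1]]]

J = mat_mul(inverse_2x2(P), mat_mul(B, P))
print(J == [[2, 1], [0, 2]])  # True
```

In the basis $\{v, w\}$ the hypothetical $B$ looks exactly like the matrix $A$ from the example: a scaling by $2$ plus a single off-diagonal $1$.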

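I hope that helps. I’d say that the most important thing to grasp from the idea of generalized eigenvectors is that, on each generalized eigenspace, a transformation acts as a scaling by $\lambda$ plus a nilpotent part. Over the complex numbers, a matrix fails to be diagonalizable exactly when one of those nilpotent parts is non-zero.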
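
To make the “action of a nilpotent” concrete for the example above, here is a short worked computation (a sketch, with $N$ as my own shorthand for $A-2I$). Since $A = 2I + N$, $N^2 = 0$, and $2I$ commutes with $N$, the binomial expansion stops after two terms, so for $n \ge 1$:
$$A^n = (2I+N)^n = 2^n I + n\,2^{n-1}N = \begin{pmatrix} 2^n & n\,2^{n-1} \\ 0 & 2^n \end{pmatrix}$$
In particular
$$A^n w = 2^n w + n\,2^{n-1} v$$
so repeated application scales $w$ like an eigenvector but also keeps pushing it further along $v$. The same computation with $\lambda = 1$ gives $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}^n = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$, which is a shear: no choice of basis can turn that sliding into independent stretches along two directions.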