How would you solve this tantalizing Halmos problem?

$1-ab$ invertible $\implies$ $1-ba$ invertible has a slick power series “proof” as below, where Halmos asks for an explanation of why this tantalizing derivation succeeds. Do you know one?

Geometric series. In a not necessarily commutative ring with
unit (e.g., in the set of all $3 \times 3$ square matrices with real
entries), if $1 - ab$ is invertible, then $1 - ba$ is invertible. However
plausible this may seem, few people can see their way
to a proof immediately; the most revealing approach belongs
to a different and distant subject.

Every student knows that
$1 - x^2 = (1 + x) (1 - x),$
and some even know that
$1 - x^3 = (1 + x + x^2) (1 - x).$
The generalization
$1 - x^{n+1} = (1 + x + \cdots + x^n) (1 - x)$
is not far away. Divide by $1 - x$ and let $n$ tend to infinity;
if $|x| < 1$, then $x^{n+1}$ tends to $0$, and the conclusion is
$\frac{1}{1 - x} = 1 + x + x^2 + \cdots$.
This simple classical argument begins with easy algebra,
but the meat of the matter is analysis: numbers, absolute
values, inequalities, and convergence are needed not only
for the proof but even for the final equation to make sense.

In the general ring theory question there are no numbers,
no absolute values, no inequalities, and no limits –
those concepts are totally inappropriate and cannot be
brought to bear. Nevertheless an impressive-sounding
classical phrase, “the principle of permanence of functional
form”, comes to the rescue and yields an analytically
inspired proof in pure algebra. The idea is to pretend
that $\frac{1}{1 - ba}$ can be expanded in a geometric series (which
is utter nonsense), so that
$(1 - ba)^{-1} = 1 + ba + baba + bababa + \cdots.$
It follows (it doesn’t really, but it’s fun to keep pretending) that
$(1 - ba)^{-1} = 1 + b (1 + ab + abab + ababab + \cdots) a,$
and, after one more application of the geometric series
pretense, this yields
$(1 - ba)^{-1} = 1 + b (1 - ab)^{-1} a.$

Now stop the pretense and verify that, despite its unlawful
derivation, the formula works. If, that is, $c = (1 - ab)^{-1}$,
so that $(1 - ab)c = c(1 - ab) = 1,$ then $1 + bca$ is the inverse
of $1 - ba.$ Once the statement is put this way, its
proof becomes a matter of (perfectly legal) mechanical computation.

Why does all this work? What goes on here? Why
does it seem that the formula for the sum of an infinite
geometric series is true even for an abstract ring in which
convergence is meaningless? What general truth does
the formula embody? I don’t know the answer, but I
note that the formula is applicable in other situations
where it ought not to be, and I wonder whether it deserves
to be called one of the (computational) elements
of mathematics. — P. R. Halmos [1]

[1] Halmos, P.R. Does mathematics have elements?
Math. Intelligencer 3 (1980/81), no. 4, 147-153


The best way that I know of interpreting this identity is by generalizing it:

$$(\lambda-ba)^{-1}=\lambda^{-1}+\lambda^{-1}b(\lambda-ab)^{-1}a.\qquad\qquad(*)$$


Note that this is both more general than the original formulation (set $\lambda=1$) and equivalent to it (rescale). Now the geometric series argument makes perfect sense in the ring $R((\lambda^{-1}))$ of formal Laurent power series, where $R$ is the original ring or even the “universal ring” $\mathbb{Z}\langle a,b\rangle:$

$$ (\lambda-ba)^{-1}=\lambda^{-1}+\sum_{n\geq 1}\lambda^{-n-1}(ba)^n=\lambda^{-1}(1+\sum_{n\geq 0}\lambda^{-n-1}b(ab)^n a)=\lambda^{-1}(1+b(\lambda-ab)^{-1}a).\ \square$$
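The same formula can also be confirmed without any series, in the spirit of the "perfectly legal" mechanical check that Halmos mentions. The following is only a sketch: it assumes that $\lambda$ is central and invertible and that $\lambda-ab$ is invertible, and it writes $c=(\lambda-ab)^{-1}$ (a shorthand borrowed from the quoted passage, not part of the argument above). Multiplying on the left gives

$$\begin{aligned}
(\lambda-ba)\,\lambda^{-1}(1+bca) &= \lambda^{-1}\bigl(\lambda-ba+\lambda bca-babca\bigr)\\
&= \lambda^{-1}\bigl(\lambda-ba+b(\lambda-ab)ca\bigr)\\
&= \lambda^{-1}(\lambda-ba+ba)=1,
\end{aligned}$$

and multiplying on the right gives

$$\begin{aligned}
\lambda^{-1}(1+bca)\,(\lambda-ba) &= \lambda^{-1}\bigl(\lambda-ba+\lambda bca-bcaba\bigr)\\
&= \lambda^{-1}\bigl(\lambda-ba+bc(\lambda-ab)a\bigr)\\
&= \lambda^{-1}(\lambda-ba+ba)=1.
\end{aligned}$$

Both products equal $1$, so $\lambda^{-1}(1+bca)$ is a two-sided inverse of $\lambda-ba$.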

A variant of $(*)$ holds for rectangular matrices of transpose sizes over any unital ring: if $A$ is a $k\times n$ matrix and $B$ is a $n\times k$ matrix then

$$(\lambda I_n-BA)^{-1}=\lambda^{-1}(I_n+B(\lambda I_k-AB)^{-1}A).\qquad\qquad(**)$$

To see that, let $a = \begin{bmatrix}0 & 0 \\ A & 0\end{bmatrix}$ and $b= \begin{bmatrix}0 & B \\ 0 & 0\end{bmatrix}$ be $(n+k)\times (n+k)$ block matrices and apply $(*).\ \square$
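As a sanity check of $(**)$, here is a minimal Python sketch over the rationals. The sample matrices `A`, `B`, the value `lam = 7`, and the helper names `matmul`, `lam_minus`, and `inverse` are all illustrative assumptions, not part of the argument above; exact `Fraction` arithmetic is used so that equality can be tested without rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lam_minus(lam, M):
    """Return lam*I - M for a square matrix M."""
    n = len(M)
    return [[(lam if i == j else 0) - M[i][j] for j in range(n)]
            for i in range(n)]

def inverse(M):
    """Gauss-Jordan inverse over the rationals; raises ValueError if singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical sample data: A is k x n and B is n x k with k = 2, n = 3.
A = [[1, 2, 0], [0, 1, 3]]
B = [[2, 1], [0, 1], [1, 4]]
lam = Fraction(7)  # assumed value; chosen so that lam*I - AB is invertible

n = len(B)
# Left side of (**): (lam*I_n - BA)^{-1}
lhs = inverse(lam_minus(lam, matmul(B, A)))
# Right side of (**): lam^{-1} (I_n + B (lam*I_k - AB)^{-1} A)
core = matmul(matmul(B, inverse(lam_minus(lam, matmul(A, B)))), A)
rhs = [[(int(i == j) + core[i][j]) / lam for j in range(n)] for i in range(n)]

identity_holds = (lhs == rhs)
```

Only a $2\times 2$ matrix is inverted on the right side, while the left side inverts a $3\times 3$ one, which is the practical appeal of $(**)$ when $k$ is much smaller than $n$.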

Here are three remarkable corollaries of $(**)$ for matrices over a field:

  • $\det(\lambda I_n-BA) = \lambda^{n-k}\det(\lambda I_k-AB)$ (characteristic polynomials match)
  • $AB$ and $BA$ have the same spectrum away from $0$
  • $\lambda^k q_k(AB)\ |\ q_k(BA)$ (compatibility of the invariant factors)
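The first corollary is easy to test numerically. The sketch below is illustrative only: the matrices `A` and `B`, the range of test values for `lam`, and the helper names `matmul` and `char_det` are assumptions made for the example, and it assumes $n\geq k$ so that $\lambda^{n-k}$ is a polynomial factor. It evaluates both sides of the determinant identity at several integers, including $\lambda=0$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_det(lam, M):
    """det(lam*I - M), computed by Gaussian elimination over the rationals."""
    n = len(M)
    rows = [[Fraction(lam if i == j else 0) - M[i][j] for j in range(n)]
            for i in range(n)]
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: determinant is zero
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap flips the sign
        det *= rows[col][col]
        for r in range(col + 1, n):
            f = rows[r][col] / rows[col][col]
            rows[r] = [v - f * w for v, w in zip(rows[r], rows[col])]
    return det

# Hypothetical sample data: A is k x n and B is n x k with k = 2, n = 3.
A = [[1, 2, 0], [0, 1, 3]]
B = [[2, 1], [0, 1], [1, 4]]
k, n = len(A), len(B)
AB, BA = matmul(A, B), matmul(B, A)

# Compare det(lam*I_n - BA) with lam^(n-k) * det(lam*I_k - AB) at sample points.
checks = [char_det(lam, BA) == lam ** (n - k) * char_det(lam, AB)
          for lam in range(-5, 6)]
all_match = all(checks)
```

Agreement at more than $n$ points already forces the two degree-$n$ polynomials in $\lambda$ to coincide for this particular pair, which in turn gives the statement about the nonzero spectrum.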

I used a noncommutative version of $(**)$ for matrices over universal enveloping algebras of Lie algebras $(\mathfrak{g},\mathfrak{g}')$ forming a reductive dual pair in order to investigate the behavior of primitive ideals under algebraic Howe duality and to compute the quantum elementary divisors of completely prime primitive ideals of $U(\mathfrak{gl}_n)$ (a.k.a. quantizations of the conjugacy classes of matrices).


The identity $(1+x)(1-yx)^{-1}(1+y)=(1+y)(1-xy)^{-1}(1+x)$ mentioned by Richard Stanley in the comments can be easily proven by the same method: after homogenization, it becomes

$$(\lambda+x)(\lambda^2-yx)^{-1}(\lambda+y)= (\lambda+y)(\lambda^2-xy)^{-1}(\lambda+x).$$

The left hand side expands in the ring $\mathbb{Z}\langle x,y\rangle((\lambda^{-1}))$ as

$$1+\sum_{n\geq 1}\lambda^{-2n}(yx)^n+ \sum_{n\geq 0}\lambda^{-2n-1}(x(yx)^n+y(xy)^n)+ \sum_{n\geq 1}\lambda^{-2n}(xy)^n,$$

which is manifestly symmetric with respect to $x$ and $y.\ \square$
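The symmetry can also be checked mechanically by expanding both sides to a finite order. The following Python sketch is an illustration under stated assumptions: a truncated element of $\mathbb{Z}\langle x,y\rangle((\lambda^{-1}))$ is stored as a dictionary from pairs (exponent of $\lambda$, word in `x` and `y`) to integer coefficients, and the names `mul`, `lam_plus`, `inv_lam2_minus`, `side`, and the truncation order `TERMS` are all hypothetical.

```python
from collections import Counter

def mul(p, q):
    """Multiply truncated series stored as {(lambda_exponent, word): coefficient}."""
    out = Counter()
    for (e1, w1), c1 in p.items():
        for (e2, w2), c2 in q.items():
            # lambda is central, so exponents add; words concatenate in order.
            out[(e1 + e2, w1 + w2)] += c1 * c2
    return out

def lam_plus(u):
    """The element lambda + u, where u is a one-letter word."""
    return Counter({(1, ""): 1, (0, u): 1})

def inv_lam2_minus(word, terms):
    """First `terms` terms of (lambda^2 - word)^{-1} = sum_n lambda^(-2n-2) word^n."""
    return Counter({(-2 * n - 2, word * n): 1 for n in range(terms)})

def side(u, v, terms):
    """Truncation of (lambda + u)(lambda^2 - vu)^{-1}(lambda + v)."""
    full = mul(mul(lam_plus(u), inv_lam2_minus(v + u, terms)), lam_plus(v))
    # Only exponents above -2*terms are unaffected by cutting the series off.
    return {key: c for key, c in full.items() if key[0] > -2 * terms}

TERMS = 6  # arbitrary truncation order for the illustration
left = side("x", "y", TERMS)    # (lambda + x)(lambda^2 - yx)^{-1}(lambda + y)
right = side("y", "x", TERMS)   # (lambda + y)(lambda^2 - xy)^{-1}(lambda + x)
symmetric = (left == right)
```

A finite truncation is of course not a proof, but it confirms the shape of the expansion: the odd powers of $\lambda^{-1}$ carry the words of odd length, and the even powers carry $(xy)^n$ and $(yx)^n$.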

Source : Link , Question Author : Bill Dubuque , Answer Author : Victor Protsak
