# Why does this “miracle method” for matrix inversion work?

Recently, I answered this question about matrix invertibility using a solution technique I called a “miracle method.” The question and answer are reproduced below:

Problem: Let $$A$$ be a matrix satisfying $$A^3 = 2I$$. Show that $$B = A^2 - 2A + 2I$$ is invertible.

Solution: Suspend your disbelief for a moment and suppose $$A$$ and $$B$$ were scalars, not matrices. Then, by power series expansion, we would simply be looking for
$$\frac{1}{B} = \frac{1}{A^2 - 2A + 2} = \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A^4}{8}-\frac{A^5}{8} + \cdots$$
where the coefficient of $$A^n$$ is
$$c_n = \frac{1+i}{2^{n+2}} \left((1-i)^n-i (1+i)^n\right).$$
But we know that $$A^3 = 2$$, so
$$\frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A^4}{8}-\frac{A^5}{8} + \cdots = \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A}{4}-\frac{A^2}{4} + \cdots$$
and by summing the resulting coefficients on $$1$$, $$A$$, and $$A^2$$, we find that
$$\frac{1}{B} = \frac{2}{5} + \frac{3}{10}A + \frac{1}{10}A^2.$$
Now, what we’ve just done should be total nonsense if $$A$$ and $$B$$ are really matrices, not scalars. But try setting $$B^{-1} = \frac{2}{5}I + \frac{3}{10}A + \frac{1}{10}A^2$$, compute the product $$BB^{-1}$$, and you’ll find that, miraculously, this answer works!
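(Editorial aside: the coefficient bookkeeping above can be sanity-checked symbolically. The sketch below, my own check using sympy, treats $$A$$ as a scalar indeterminate $$a$$ and reduces the product $$B \cdot B^{-1}$$ modulo the relation $$a^3 = 2$$.)

```python
from sympy import symbols, div, expand, Rational

a = symbols('a')
B = a**2 - 2*a + 2
# The candidate inverse found by summing the collapsed series coefficients
Binv = Rational(2, 5) + Rational(3, 10)*a + Rational(1, 10)*a**2

# Reduce B * Binv modulo a^3 - 2 (the relation A^3 = 2I);
# div returns (quotient, remainder)
q, r = div(expand(B * Binv), a**3 - 2, a)
print(r)  # prints 1, confirming B * Binv ≡ 1 mod (a^3 - 2)
```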

I discovered this solution technique some time ago while exploring a similar problem in Wolfram Mathematica. However, I have no idea why any of these manipulations should produce a meaningful answer when scalar and matrix inversion are such different operations. Why does this method work? Is there something deeper going on here than a serendipitous coincidence in series expansion coefficients?
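To make the "miracle" concrete, here is a quick numerical check of the claimed identity. The test matrix is my own choice: the companion matrix of $$x^3 - 2$$, which satisfies $$A^3 = 2I$$ by construction.

```python
import numpy as np

# Companion matrix of x^3 - 2, so A^3 = 2I (one choice among many)
A = np.array([[0., 0., 2.],
              [1., 0., 0.],
              [0., 1., 0.]])
I = np.eye(3)
assert np.allclose(A @ A @ A, 2 * I)

B = A @ A - 2 * A + 2 * I
B_inv = (2/5) * I + (3/10) * A + (1/10) * (A @ A)

print(np.allclose(B @ B_inv, I))  # True
print(np.allclose(B_inv @ B, I))  # True
```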

The real answer is that the set of $n\times n$ matrices forms a Banach algebra: a Banach space equipped with a multiplication that distributes over addition and satisfies $\|xy\| \le \|x\|\,\|y\|$. In the reals, multiplication is the same as scaling, so the distinction doesn't matter and we never think about it. With matrices, scaling by a number and multiplying two matrices are genuinely different operations. The point is that there is no miracle: the argument you gave only uses tools available in any Banach algebra (notably, you never used commutativity), so it generalizes directly.
This kind of trick is used all the time, to great effect. One classic example is proving that $1-A$ is invertible whenever $\|A\|<1$: take the geometric series argument from real analysis, check that each step still works in a Banach algebra, and you are done.
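For concreteness, here is a small numerical sketch of that classic Neumann series example, with a test matrix of my own construction: a random matrix rescaled so that $\|A\| < 1$, whose geometric series of powers is compared against the directly computed inverse of $I - A$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random 4x4 matrix, rescaled so its spectral norm is 0.9 < 1
A = rng.standard_normal((4, 4))
A *= 0.9 / np.linalg.norm(A, 2)

I = np.eye(4)
# Partial sums of the Neumann series I + A + A^2 + ...
S, term = np.eye(4), np.eye(4)
for _ in range(300):
    term = term @ A
    S += term

print(np.allclose(S, np.linalg.inv(I - A)))  # True
```

Since $\|A\| = 0.9$, the tail of the series is bounded by $0.9^{301}/(1-0.9)$, far below floating-point comparison tolerances after 300 terms.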