Why “characteristic zero” and not “infinite characteristic”?

The characteristic of a ring (with unity, say) is the smallest positive number $n$ such that $n \cdot 1 = 0$ (that is, $1$ added to itself $n$ times gives $0$), provided such an $n$ exists. Otherwise, we define it to be $0$.

But why characteristic zero? Why do we not define it to be $\infty$ instead? Under this alternative definition, the characteristic of a ring is simply the “order” of the additive cyclic group generated by the unit element $1$.

My feeling is that there is a precise and convincing explanation for the common convention, but none comes to mind. I couldn’t find the answer in the Wikipedia article either.

There are two orderings of the set $\mathbb N = \{0,1,\dots\}$:

• magnitude $a \leq b$
• divisibility $a\mid b$ (i.e. $\exists c. b = a c$)

They are mostly compatible: whenever $a \mid b$ and $b \neq 0$, we have $a \leq b$.

Some definitions are phrased using the “greater than” ordering, while the divisibility ordering is what they are really about.

For example, the greatest common divisor of $a$ and $b$ might be defined as the greatest number which is a common divisor of both $a$ and $b$. The characteristic of a ring $R$ might be defined as the smallest number $n>0$ which satisfies $n \cdot 1 = 0$.

Under such commonly taught definitions, it seems natural that $\operatorname{gcd}(0,0)=\infty$ and $\operatorname{char} \mathbb Z = \infty$.

However, those definitions implicitly rely on ideals, and are better phrased using the divisibility order. The incompatibility is then more visible: $0$ is the largest element in the divisibility order (every number divides $0$), while it is the smallest in the magnitude order. The magnitude order has no largest element, and often $\infty$ is added to cover this case.

So let’s formulate the definitions again, but this time using divisibility ordering.

• The greatest common divisor of two numbers $a,b$ is the greatest number (in the sense of $\mid$) that is a divisor of both $a$ and $b$ (i.e. lies below $a$ and $b$ in the divisibility ordering). This is prettier: $\operatorname{gcd}$ is now the meet operator $\wedge$ in the lattice $(\mathbb N, \mid)$; it also makes $\mathbb N$ a monoid, with $0$ as the identity element, since $\operatorname{gcd}(a,0)=a$. Additionally, the definition can be adapted to any ring.
• The characteristic of a ring $R$ is the smallest number $n$ (in the sense of $\mid$) that satisfies $n \cdot 1 = 0$. As a bonus, compared to the previous definition, we can drop the $n>0$ restriction: zero is always a valid “annihilator”, but it is often not the smallest one. In $\mathbb Z$ it is the only one, so now we get $\operatorname{char} \mathbb Z = 0$.
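To make the first reformulation concrete, here is a minimal Python sketch that computes the gcd literally as the $\mid$-greatest common divisor. The helper names `divides` and `gcd_by_divisibility` are made up for this illustration, and the brute-force search is only meant for small inputs.

```python
import math

def divides(a: int, b: int) -> bool:
    """a | b on the naturals: b == a * c for some natural c."""
    # 0 divides only 0; every natural divides 0.
    return b == 0 if a == 0 else b % a == 0

def gcd_by_divisibility(a: int, b: int) -> int:
    """Greatest common divisor, with 'greatest' read in the divisibility order."""
    # Common divisors found in 0..max(a, b). For a == b == 0 every natural
    # is a common divisor; the search only sees 0, which is still the
    # greatest one because every natural divides 0.
    candidates = [d for d in range(max(a, b) + 1)
                  if divides(d, a) and divides(d, b)]
    # The greatest element is the candidate that all candidates divide.
    for g in candidates:
        if all(divides(c, g) for c in candidates):
            return g
    raise ValueError("no greatest common divisor found")

print(gcd_by_divisibility(12, 18))
print(gcd_by_divisibility(0, 0))
```

No special case for $(0,0)$ is needed: the divisibility order itself produces $0$, which agrees with what Python's `math.gcd(0, 0)` returns.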

Characteristic is a “multiplicative” notion, like gcd. If you have a (unital) homomorphism of rings $f: A \to B$, then $\operatorname{char} B \mid \operatorname{char} A$, because $f(n \cdot 1_A) = n \cdot 1_B$. For example, there is no ring homomorphism from ${\mathbb Z}_2$ to ${\mathbb Z}_4$, since $4 \nmid 2$; in a sense, ${\mathbb Z}_2$ is “smaller” than ${\mathbb Z}_4$. “Bigger” rings have “more divisible” characteristics: their characteristics are greater in the sense of divisibility. And the “most divisible” number is $0$. Another example is $\operatorname{char}(A \times B) = \operatorname{lcm}(\operatorname{char} A, \operatorname{char} B)$.
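The lcm formula can be checked by brute force on products of the rings $\mathbb Z_m$. The following is only a sketch under some assumptions of mine: the modulus `0` stands for $\mathbb Z$ itself, the names `reduce_mod` and `char_of_product` are hypothetical, and the search `bound` has to exceed the true characteristic for the answer to be meaningful. `math.lcm` requires Python 3.9 or later.

```python
import math

def divides(a: int, b: int) -> bool:
    """a | b on the naturals; 0 divides only 0."""
    return b == 0 if a == 0 else b % a == 0

def reduce_mod(x: int, m: int) -> int:
    """Image of the integer x in Z/mZ; m == 0 stands for Z itself."""
    return x if m == 0 else x % m

def char_of_product(moduli, bound: int = 200) -> int:
    """Characteristic of the product of the rings Z/mZ for m in moduli.

    Assumption: bound is larger than the true characteristic.
    """
    # n annihilates 1 = (1, ..., 1) exactly when n is 0 in every factor.
    annihilators = [n for n in range(bound)
                    if all(reduce_mod(n, m) == 0 for m in moduli)]
    # The characteristic is the annihilator that divides all the others.
    for d in annihilators:
        if all(divides(d, n) for n in annihilators):
            return d
    raise ValueError("bound too small")

print(char_of_product([4, 6]), math.lcm(4, 6))
print(char_of_product([0, 4]), math.lcm(0, 4))
```

With this convention $\mathbb Z \times \mathbb Z_4$ gets characteristic $0 = \operatorname{lcm}(0, 4)$ without any special treatment, which would not work with $\infty$ in place of $0$.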

In a bit more abstract language: given any ideal $I \subseteq \mathbb Z$, we associate to it the smallest nonnegative element, under the divisibility order. By properties of $\mathbb Z$, every other element of $I$ is a multiple of it. Let’s call this number $\operatorname{min}(I)$.

We can now define $\operatorname{gcd}(a,b)=\operatorname{min} ((a) + (b))$, and $\operatorname{char} R = \min (\ker f)$, where $f \colon \mathbb Z \to R$ is the canonical map.
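As a sanity check of $\operatorname{gcd}(a,b)=\operatorname{min}((a)+(b))$, one can sample the ideal $(a)+(b)$ and pick its divisibility-least element. This is a rough sketch: `sample_ideal_sum` and `min_of_ideal` are names invented here, and the sample is finite, using coefficients up to $\max(a,b)$ in absolute value, which is enough for Bézout coefficients of naturals $a, b$.

```python
import math

def divides(a: int, b: int) -> bool:
    """a | b on the naturals; 0 divides only 0."""
    return b == 0 if a == 0 else b % a == 0

def sample_ideal_sum(a: int, b: int) -> list:
    """Finite sample of the nonnegative elements of (a) + (b) = {x*a + y*b}."""
    bound = max(a, b, 1)  # large enough to reach Bezout coefficients
    coeffs = range(-bound, bound + 1)
    return sorted({x * a + y * b
                   for x in coeffs for y in coeffs
                   if x * a + y * b >= 0})

def min_of_ideal(sample: list) -> int:
    """Divisibility-least element: the one that divides every sampled element."""
    for d in sample:
        if all(divides(d, n) for n in sample):
            return d
    raise ValueError("sample does not contain a generator")

print(min_of_ideal(sample_ideal_sum(4, 6)))
print(min_of_ideal(sample_ideal_sum(0, 0)))
```

For the zero ideal the sample is just $\{0\}$, so $\operatorname{min}$ returns $0$, matching $\operatorname{gcd}(0,0)=0$.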

The definition of $\operatorname{min}(I)$ works for any PID; it does not require the magnitude order, although the result is then determined only up to multiplication by a unit. In any PID, $I = (\operatorname{min}(I))$.

(I dislike saying the ideal $\{0\}$ is “generated” by $0$; although this is true, it is also generated by the empty set. We do not say that $(2)$ is generated by $0$ and $2$.)