# What is the Riemann zeta function?

In layman’s terms, as much as possible: What is the Riemann zeta function, and why does it come up so often in relation to prime numbers?

Suppose you want to put a probability distribution on the natural numbers for the purpose of doing number theory. What properties might you want such a distribution to have? Well, if you’re doing number theory then you want to think of the prime numbers as acting “independently”: knowing that a number is divisible by $p$ should give you no information about whether it’s divisible by $q$.

That quickly leads you to the following realization: you should choose the exponent of each prime in the prime factorization independently. So how should you choose these? It turns out that the probability distribution on the non-negative integers with maximum entropy and a given mean is a geometric distribution, as explained, for example, by Keith Conrad. So let’s take the probability that the exponent of $p$ is $k$ to be equal to $(1 - r_p) r_p^k$ for some constant $r_p$ with $0 < r_p < 1$.

This gives the probability that a positive integer $n = p_1^{e_1} \cdots p_k^{e_k}$ occurs as

$\displaystyle C \prod_{i=1}^{k} r_{p_i}^{e_i}$

where $C = \prod_p (1 - r_p)$. So we need to choose $r_p$ such that this product converges to a nonzero value. Now, we’d like the probability that $n$ occurs to be monotonically decreasing as a function of $n$. It turns out (and this is a nice exercise) that this is true if and only if $r_p = p^{-s}$ for some $s > 1$ (the restriction to $s > 1$ is what makes $C$ converge to a nonzero value), which gives the probability that $n$ occurs as
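To get a feel for the monotonicity condition, here is a minimal Python sketch that compares the unnormalized weights $\prod r_p^{e}$ for two choices of $r_p$. The helper names (`factorize`, `weight`, `is_nonincreasing`), the cutoff of 200, and the alternative choice $r_p = 1/(p+1)$ are all assumptions made for illustration; a finite check like this is of course not a proof of the exercise.

```python
def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def weight(n, r):
    """Unnormalized probability of n: product of r(p)**e over the prime powers p**e in n."""
    w = 1.0
    for p, e in factorize(n).items():
        w *= r(p) ** e
    return w


def is_nonincreasing(r, upto=200):
    """Check whether weight(1) >= weight(2) >= ... >= weight(upto)."""
    weights = [weight(n, r) for n in range(1, upto + 1)]
    return all(a >= b for a, b in zip(weights, weights[1:]))


print(is_nonincreasing(lambda p: p ** -2.0))       # r_p = p^{-s} with s = 2: True
print(is_nonincreasing(lambda p: 1.0 / (p + 1)))   # hypothetical r_p = 1/(p+1): False
```

The second choice already fails at $n = 4, 5$: the weight of $4$ is $(1/3)^2 = 1/9$, while the weight of $5$ is $1/6$.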

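$\displaystyle \frac{n^{-s}}{\zeta(s)}$

where $\zeta(s) = \sum_{n \ge 1} n^{-s}$ is the Riemann zeta function. Here $\prod_i p_i^{-s e_i} = n^{-s}$, and $C = \prod_p (1 - p^{-s}) = 1/\zeta(s)$ by the Euler product formula.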
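The two descriptions of this probability (independent geometric exponents versus $n^{-s}/\zeta(s)$) can be compared numerically. The following is a rough sketch, not a proof: the function names, the prime cutoff, and the number of series terms are assumptions chosen for illustration, and both the product over primes and the sum defining $\zeta(s)$ are truncated, so the two values only agree approximately.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(range(i * i, limit + 1, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def prob_from_exponents(n, s, prime_limit=100_000):
    """C * prod r_p^e with r_p = p^{-s}; C is a truncated product over primes."""
    C = 1.0
    for p in primes_up_to(prime_limit):
        C *= 1.0 - p ** (-s)
    w = 1.0
    for p, e in factorize(n).items():
        w *= (p ** (-s)) ** e
    return C * w


def prob_from_zeta(n, s, terms=200_000):
    """n^{-s} / zeta(s), with zeta(s) approximated by a truncated sum."""
    zeta = sum(k ** (-s) for k in range(1, terms + 1))
    return n ** (-s) / zeta


# Compare the two formulas for n = 12 = 2^2 * 3 at s = 2.
print(prob_from_exponents(12, 2.0), prob_from_zeta(12, 2.0))
```

At $s = 2$ both values should be close to $\frac{6}{\pi^2} \cdot \frac{1}{144}$, since $\zeta(2) = \pi^2/6$. The truncations converge much more slowly as $s$ approaches $1$.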

One way of thinking about this argument is that $\zeta(s)$ is the partition function of a statistical-mechanical system called the Riemann gas. As $s$ gets closer to $1$, the temperature of this system increases until it would require infinite energy to make $s$ equal to $1$. But this limit is extremely important to understand: it is the limit in which the probability distribution above gets closer and closer to uniform. So it’s not surprising that you can deduce statistical information about the primes by studying the behavior as $s \to 1$ of this distribution.
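One concrete piece of statistical information: under the distribution $n^{-s}/\zeta(s)$, the probability that $d$ divides $n$ is exactly $d^{-s}$, because summing $n^{-s}$ over the multiples of $d$ gives $d^{-s}\zeta(s)$. As $s \to 1$ this tends to $1/d$, and divisibility by distinct primes is independent. The sketch below checks this numerically at $s = 2$; the function name and the cutoff are assumptions for illustration, and $s = 2$ is used only because the truncated sums converge quickly there.

```python
def zeta_prob_divisible(d, s, cutoff=200_000):
    """Approximate P(d divides n) when P(n) is proportional to n^{-s}.

    Both sums are truncated at `cutoff`, so the result is only approximate.
    """
    total = sum(n ** (-s) for n in range(1, cutoff + 1))
    multiples = sum(n ** (-s) for n in range(d, cutoff + 1, d))
    return multiples / total


s = 2.0
p2 = zeta_prob_divisible(2, s)   # close to 2^{-s}
p3 = zeta_prob_divisible(3, s)   # close to 3^{-s}
p6 = zeta_prob_divisible(6, s)   # close to 6^{-s} = 2^{-s} * 3^{-s}
print(p2, p3, p6, p2 * p3)
```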

Let me mention two other reasons to care about the limit as $s \to 1$ of the above distribution. First, the basic reason to think of the primes as acting independently is the Chinese Remainder Theorem. Second, a natural reason to look at a distribution where the probability that a number has exactly $k$ factors of $p$ is $(1 - p^{-1}) p^{-k}$ is that this is precisely the distribution you get on the residues $\bmod\ p^n$ for $k < n$. In fact, I believe this can be upgraded to the corresponding statement about Haar measure on the $p$-adic integers.
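The claim about residues can be checked exactly by brute force for small cases. This is a minimal sketch with exact rational arithmetic; the function names and the choice $p = 3$, $n = 4$ are assumptions for illustration.

```python
from fractions import Fraction


def p_adic_valuation(x, p):
    """Number of times the prime p divides the nonzero integer x."""
    k = 0
    while x % p == 0:
        x //= p
        k += 1
    return k


def valuation_distribution(p, n):
    """Fraction of residues mod p**n with exactly k factors of p, for k = 0, ..., n - 1."""
    modulus = p ** n
    counts = [0] * n
    # Residue 0 is divisible by p**n, so it does not fall into any class k < n.
    for x in range(1, modulus):
        counts[p_adic_valuation(x, p)] += 1
    return [Fraction(c, modulus) for c in counts]


p, n = 3, 4
observed = valuation_distribution(p, n)
expected = [(1 - Fraction(1, p)) * Fraction(1, p ** k) for k in range(n)]
print(observed == expected)  # True
```

The count behind this is that exactly $p^{n-k} - p^{n-k-1}$ of the $p^n$ residues are divisible by $p^k$ but not by $p^{k+1}$.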