# The myth of no prime formula?

Terence Tao claims:

For instance, we have an exact formula for the $n^\text{th}$ square
number – it is $n^2$ – but we do not have a (useful) exact
formula for the $n^\text{th}$ prime number $p_n$!
“God may not play dice with the universe, but something
strange is going on with the prime numbers.” (Paul Erdős,
1913–1996)

However there exist an exact formula for the nth prime

A double sum for the nth prime pn is where

(See Prime formulas at MathWorld.) There exist even an exact formula for prime counting function $\pi (x)$. So why are mathematicians trying to prove the Riemann hypothesis to find a better estimate for $\pi(x)$ and $p_n$ when they have those exact formulas?

If there is a mathematical body of knowledge that allows you to manipulate formulas (13) and (14) into a better form, then great! The Riemann zeta function has already proved its worth in this field. It was instrumental in the proofs by Hadamard and de la Vallée-Poussin of the Prime Number Theorem. What allowed that to happen was that the zeta function is a meromorphic function, and so complex variable theory could be applied to its study. One never knows with certainty whether further study of the zeta function will pay off, but people are betting that it will, based on past successes.

I don’t know whether anyone is promoting (13) and (14) as useful tools in the study of prime numbers, but if so, the burden is on those people to exhibit mathematical tools that allow one to manipulate (13) and (14) effectively.

Because of all the floor functions, (13) and (14) don’t have very nice analytical properties. As other posters have pointed out, they encode an elementary algorithm for identifying primes. In particular, the expression

that appears in the numerator of (14) equals $1$ if $s$ evenly divides $j$ and equals $0$ if it does not. Summing this quantity for $s$ from $1$ to $j$ gives you the number of divisors of $j.$ Prime numbers have exactly two divisors; composite numbers all have at least three. The numerator in (14) is therefore $0$ when $j$ is prime, and positive when it is composite. Dividing by $j$ and multiplying by $-1$ results in the quantity $s(j),$ which is $0$ when $j$ is prime and lies in the interval $(-1,0)$ when $j$ is composite. The floor of $s(j$) therefore equals $0$ when $j$ is prime and $-1$ when $j$ is composite. Adding $1$ to $\lfloor s(j)\rfloor$ gives the characteristic function of the primes.

Notice that using the formula (14) requires one to do more divisions than even a naive brute-force algorithm to compute the number of divisors would require. That price might be worth paying if the formula had nice mathematical properties that allowed one to extract additional information from it, but I don’t see that this has been demonstrated. To me, it looks like a cumbersome arithmetic encoding of an inefficient algorithm. I would be happy to be proved wrong about this!

Formula (13) likewise looks like a cumbersome arithmetic encoding of a brute-force method for computing the $n^\text{th}$ prime, and seems likely to require more calculation than state-of-the-art algorithms. Again, it would be nice if someone could show how to extract useful information from the formula, but it doesn’t look very promising.

Added: I think it worth pointing out that some non-trivial analytic number theory enters into formula (13) in the choice of upper bound on the summation over $k.$ You need to ensure that the $p_n$ lies in the range of summation, but that $p_{2n}$ does not. The reason is that the summand equals $1$ when $k equals $0$ for $p_n\le k and becomes negative for $k\ge p_{2n}.$ So you need the summation over $k$ to include all $p_n-1$ terms where the summand equals $1$, but none of the terms where it is negative. This will ensure that the sum evaluates to $p_n-1.$

It appears to me that you need something like (3.12) and (3.13) in this paper of Rosser and Schoenfeld to justify the upper bound. Formula (3.12) is Rosser's Theorem which Dietrich Burde's answer to this post suggests requires deep knowledge of zeta function zeros. Interestingly, the author of your formulas (13) and (14) (Sebastián Martín Ruiz) does not rigorously justify the upper bound in his article, giving only a heuristic argument and and stating that it is “very probable” that the necessary inequalities are satisfied.

Perhaps one could play with the constant in front of $\lfloor n\log n\rfloor$ and try to use Chebyshev's bounds (Theorem 3.1 here) instead of Rosser's Theorem. (The derivation of Chebyshev's bounds doesn't require knowledge of zeta function zeros.) But this seems possibly rather delicate, and may not work.

Second addition: Some discussion of the formula, and Ruiz's view of it, can be found here.