The myth of no prime formula?

Terence Tao claims:

For instance, we have an exact formula for the nth square
number – it is n2 – but we do not have a (useful) exact
formula for the nth prime number pn!
“God may not play dice with the universe, but something
strange is going on with the prime numbers.” (Paul Erdős,

However there exist an exact formula for the nth prime

A double sum for the nth prime pn is pn=1+2(nlnn+1)k=1[1kj=21+s(j)n], where s(j)js=1(jsj1s)2j

(See Prime formulas at MathWorld.) There exist even an exact formula for prime counting function π(x). So why are mathematicians trying to prove the Riemann hypothesis to find a better estimate for π(x) and pn when they have those exact formulas?


If there is a mathematical body of knowledge that allows you to manipulate formulas (13) and (14) into a better form, then great! The Riemann zeta function has already proved its worth in this field. It was instrumental in the proofs by Hadamard and de la Vallée-Poussin of the Prime Number Theorem. What allowed that to happen was that the zeta function is a meromorphic function, and so complex variable theory could be applied to its study. One never knows with certainty whether further study of the zeta function will pay off, but people are betting that it will, based on past successes.

I don’t know whether anyone is promoting (13) and (14) as useful tools in the study of prime numbers, but if so, the burden is on those people to exhibit mathematical tools that allow one to manipulate (13) and (14) effectively.

Because of all the floor functions, (13) and (14) don’t have very nice analytical properties. As other posters have pointed out, they encode an elementary algorithm for identifying primes. In particular, the expression
that appears in the numerator of (14) equals 1 if s evenly divides j and equals 0 if it does not. Summing this quantity for s from 1 to j gives you the number of divisors of j. Prime numbers have exactly two divisors; composite numbers all have at least three. The numerator in (14) is therefore 0 when j is prime, and positive when it is composite. Dividing by j and multiplying by 1 results in the quantity s(j), which is 0 when j is prime and lies in the interval (1,0) when j is composite. The floor of s(j) therefore equals 0 when j is prime and 1 when j is composite. Adding 1 to s(j) gives the characteristic function of the primes.

Notice that using the formula (14) requires one to do more divisions than even a naive brute-force algorithm to compute the number of divisors would require. That price might be worth paying if the formula had nice mathematical properties that allowed one to extract additional information from it, but I don’t see that this has been demonstrated. To me, it looks like a cumbersome arithmetic encoding of an inefficient algorithm. I would be happy to be proved wrong about this!

Formula (13) likewise looks like a cumbersome arithmetic encoding of a brute-force method for computing the nth prime, and seems likely to require more calculation than state-of-the-art algorithms. Again, it would be nice if someone could show how to extract useful information from the formula, but it doesn’t look very promising.

Added: I think it worth pointing out that some non-trivial analytic number theory enters into formula (13) in the choice of upper bound on the summation over k. You need to ensure that the pn lies in the range of summation, but that p2n does not. The reason is that the summand equals 1 when k<pn, equals 0 for pnk<p2n, and becomes negative for kp2n. So you need the summation over k to include all pn1 terms where the summand equals 1, but none of the terms where it is negative. This will ensure that the sum evaluates to pn1.

It appears to me that you need something like (3.12) and (3.13) in this paper of Rosser and Schoenfeld to justify the upper bound. Formula (3.12) is Rosser's Theorem which Dietrich Burde's answer to this post suggests requires deep knowledge of zeta function zeros. Interestingly, the author of your formulas (13) and (14) (Sebastián Martín Ruiz) does not rigorously justify the upper bound in his article, giving only a heuristic argument and and stating that it is “very probable” that the necessary inequalities are satisfied.

Perhaps one could play with the constant in front of nlogn and try to use Chebyshev's bounds (Theorem 3.1 here) instead of Rosser's Theorem. (The derivation of Chebyshev's bounds doesn't require knowledge of zeta function zeros.) But this seems possibly rather delicate, and may not work.

Second addition: Some discussion of the formula, and Ruiz's view of it, can be found here.

Source : Link , Question Author : user177691 , Answer Author : Community

Leave a Comment