Exceptional lattices and discrepancy?

Some linear lattices combine very good properties (e.g. highest density, maximum kissing number, etc.) like the hexagonal lattice (in 2-dimensional space R2), the E8 (in R8) or the famous Leech lattice (=E24, in R24). However, not in all dimensions such extraordinary good lattices are known (only in d=1,2,8,24); and this is for several good geometrical reasons, which become clear when inspecting these very good lattices and “good” dimensions. So we know why just 1,2,8,24 are so special.

For Monte-Carlo integration usually random samples are used (usually standard-uniform samples, so X in (0,1)d). However, for variance reduction, we can also use N d-dimensional lattice points X=X1,X2,…Xn). But for MC the discrepancy D matters mostly regarding integration error (not kissing number, etc.). D gets worse for higher dimensions e.g. roughly following (logN)^d/N. So for discrepancy there are almost no known exceptional “good” dimensions; actually only d=1 and 2 are (a bit) exceptional.

So I wonder: Is d=8 or d=24 also “special” regarding discrepancy or MC integration, e.g.:

  1. In those dimensions it might be easier to construct
    low-discrepancy sets?
  2. Or even a extraordinary low D can be achieved in those d?
  3. Or one can prove that even in these d no special characteristics exist regarding D?
  4. Actually (by far) not only D matters regarding integration error, but e.g. also the coverage in lower-dimensional projections, so maybe on these d=8 or 24 may give advantages (or not).

Something similar to #1 is often the case for prime numbers N. And in d=2 explicit constructions are available giving optimum D (e.g. using Fibonacci numbers). If #3 can be proven, then #2 is disproved, which would be helful too for me – although still #1 could be possible, offering at least practical advantages. A comment regarding #4: Each variable xi itself (1d-projection) can be made optimum distributed (like delta-x=1/N) by using latin hypercube sets. So if we could extend this to higher sub-dimensions by means of E8 or E24 it would be very helpful too (even if D is non-optimum). Note: Ignoring E8 and E24, one can improve LHS a bit by using methods like orthogonal LHS. In 2D the Fibonacci lattice is giving lowest discrepancy, and it can be created by rotating/shearing a Z2 lattice. I wonder if something similar is possible in d=8 as well?

Answer

Regarding the construction of low discrepancy sequences in $d$-dimensions, it is generally true to that there are no particularly special values of $d$.

Thus your 3rd hypothesis is closest to the truth.

However, the appearance that the lower dimensions are better/easier, is probably just a consequence of two effects:

  1. That the construction of low discrepancy sequences in 1-, 2- or
    3-dimensions has a vast range of applications in maths and science
    and thus are more studied and thus better known than higher dimensional sequences;
  2. That particular properties, such as discrepancy bounds, are generally easier to rigorously prove in the lower dimensions (or
    particular values of $d$).

There are many methods to construct high $d$-dimensional low discrepancy sequences. One way to categorize the different types is by the method of constructing their base (hyper)-parameters:

  • Irrational fractions: Kronecker, Richtmyer, Ramshaw
  • (Co)prime numbers: Van der Corput, Halton, Faure
  • Irreducible Polynomials : Niederreiter
  • Primitive polynomials: Sobol

As one moves to higher dimensions, the construction of the irreducible polynomials (Niederreiter) and primitive polynomials (Sobol) becomes more and more complex. However, from a pragmatic / programming perspective extending the Kronecker and Halton sequences to arbitrary dimensions, is trivial, as it generally only requires calculating the first $d$ prime numbers. The problem lies in the fact that these constructions are not provably optimal.

Kronecker Sequences

The Kronecker sequence is defined as,
$$ \mathbf{t}_{n+1} = \mathbf{t}_{n} + \pmb{\alpha} \; (\textrm{mod} \; 1) $$
for fixed constant vector $\pmb{\alpha} = (\alpha_1, \alpha_2, …, \alpha_d)$.

For the $d=1$ dimensional case, it is well known that $\alpha = \phi = \frac{1+\sqrt{5}}{2}$ is the provably best value.

A few lesser well known properties are worth noting.

Firstly, for any integers $a,b,c,d$ such that $|ad-bc|=1$,
$$\alpha = \frac{a +b \phi}{c+d \phi}$$ is also provably optimal.

Secondly, the more badly approximable the value of $\alpha$ is by rationals, the better the discrepancy sequence is. Given that the $\phi$ is the most badly approximable number, it stands that it is ‘the most irrational number’ and the optimal value for a low discrepancy sequence.

Springborn as well as Spalding give number theoretic reasons why $1+\sqrt{2}$ is the second most badly approximable number, and why $ \frac{1}{10} (9+\sqrt{221})$ is the third-most badly approximable number.

Extending this to higher dimensions requires articulating the concept of ‘badly approximable vectors’. Hensley outlines an argument on why a particular vector, based on the plastic constant is most likely the most badly approximable vector, and thus the most likely the optimal value for the construction of $d=2$ dimensional low discrepancy sequences.

Rus has created a beautiful and interactive visualization of the two dimensional sequence based on this value of the plastic constant.

For higher dimensions, the most common method for constructing $\pmb{\alpha}$ is to merely select the square root of the first $d$ prime numbers. This is an sensible and convenient choice which has been shown to be empirically quite good. However, there are no proofs on its optimality, or even near-optimality!

In summary, although it is extremely common to construct Kronecker sequences in arbitrarily high dimensions, there is no known value of $\pmb{\alpha}$ that is provably optimal, except for the trivial $d=1$ case.

Halton Sequence

Similarly, in the case of the Halton sequence, which is merely a component-wise concatenation of $d$ Van der Corput sequences, each constructed with a different base, $b_i$, such that the $b_i$ are pairwise coprime. In nearly all applications, again for convenience the $b_i$ are chosen to be the first $d$ prime numbers. Empirically, this has shown to give good results, however, again there is no proof that it is optimal.

An alternate low discrepancy sequence

I have written a detailed blog post, “The unreasonable effectiveness of quasirandom sequences“, which not only compares some of these sequences in higher dimensions, but also introduces a new sequence that is based on a generalization of the golden ratio. This sequence is

  • algebraically simpler
  • faster to compute
  • produces more consistent outputs
  • suffers less technical issues

than existing existing low discrepancy sequences, such as the Sobol, Niederreter, Halton and standard Kronecker sequences.

Furthermore, I give examples of how it (as well as other quasirandom low discrepancy sequences) are effective in higher-dimensional Monte Carlo numerical integration.

The solution is an additive recurrence method (modulo 1) which generalizes the 1-dimensional problem whose solution depends on the Golden Ratio. The solution to the $d$-dimensional problem, depends on a special constant $\phi_d$, where $\phi_d$ is the unique positive root of $ x^{d+1}=x+1$.

For $d=1$,  $ \phi_1 = 1.618033989… $, which is the canonical golden ratio.

For $d=2$, $ \phi_2 = 1.3247179572… $, which  is often called the plastic constant, and has some beautiful properties. This value was conjectured to most likely be the optimal value for a related two-dimensional problem [Hensley, 2002]. Jacob Rus has posted a beautiful visualization of this 2-dimensional low discrepancy sequence, which can be found here.

And finally specifically relating to your question, for $d=3$, $ \phi_3 = 1.2207440846… $

With this special constant in hand, the calculation of the $n$-th term is now extremely simple and fast to calculate:

$$ R: \mathbf{t}_n = \pmb{\alpha}_0 + n \pmb{\alpha} \; (\textrm{mod} \; 1),  \quad n=1,2,3,… $$
$$ \textrm{where} \quad \pmb{\alpha} =(\frac{1}{\phi_d}, \frac{1}{\phi_d^2},\frac{1}{\phi_d^3},…\frac{1}{\phi_d^d}), $$

Of course, the reason this is called a recurrence sequence is because the above definition is equivalent to
$$ R: \mathbf{t}_{n+1} = \mathbf{t}_{n} + \pmb{\alpha} \; (\textrm{mod} \; 1) $$

In nearly all instances, the choice of $\pmb{\alpha}_0 $ does not change the key characteristics, and so for reasons of obvious simplicity, $\pmb{\alpha}_0 =\pmb{0}$ is the usual choice. However, there are some arguments, relating to symmetry, that suggest that $\pmb{\alpha}_0=\pmb{1/2}$ is a better choice.

Although I provide empirical evidence that strongly suggests that it is an efficient and robust sequence, it does not yet come with proof.

Of course, one can see from the definition of this sequence, there is no particular value of $d$ that is more special than any other.

The Python code for constructing this sequence is

# Use Newton-Rhapson-Method to find the value of the root.
def gamma(d):
    x=1.0000
    for i in range(20):
        x = x-(pow(x,d+1)-x-1)/((d+1)*pow(x,d)-1)
    return x

d=17
n=1000

g = gamma(d)
alpha = np.zeros(d)                 
for j in range(d):
    alpha[j] = pow(1/g,j+1) %1
z = np.zeros((n, d))    
for i in range(n):
    z = (0.5 + alpha*(i+1)) %1
print(z)

Hope that helps!

Attribution
Source : Link , Question Author : user32038 , Answer Author : Martin Roberts

Leave a Comment