Random Number Generators for Precision Monte Carlos

Frederick James (CERN) 19th Sep 2020

The Monte Carlo method is used primarily for calculations that are too difficult or even impossible to perform analytically, so that as our experiments and theories become more accurate, we become more dependent on the quality of our MC calculations and hence on the random numbers they use. But how do we know if those numbers are random enough?

For many years pseudorandom number generators (RNG’s) were employed for large-scale Monte Carlo calculations. For many of them there was little or no reason to believe the numbers were actually random. The best ones had long periods, passed some empirical tests of randomness and most often appeared to give acceptable answers to Monte Carlo (MC) calculations, but, with one exception (RANLUX), all the widely used RNG’s were known to fail some empirical tests, indicating that they could produce incorrect results. Detecting incorrect results however requires knowing the exact answer to the problem, which we rarely know, so we cannot estimate how often our calculations are wrong because of the RNG, even if we know it can happen.

This unpleasant situation has been tolerated because there seemed to be no way out until we could find a RNG based on a theory of randomness which could guarantee at least a certain degree of independence in the random numbers. Such a theory had in fact been developed by mathematicians, mostly in the Soviet Union, to study the properties of classical mechanical systems which did not have analytical solutions. They developed a way to define different degrees of randomness known as Kolmogorov-Anosov mixing (KA mixing) and found the conditions under which classical mechanical systems could exhibit random behaviour.

It turns out to be very difficult to translate this randomness into a practical random number generator: fast, reproducible, unbiased, long period, and demonstrably random. Nevertheless, there are now four published algorithms for generating pseudorandom numbers based on KA mixing, and they are the only RNG algorithms we know of which have introduced the notion of randomness [1]. One of these RNG’s (RANLUX) has been used for many years and is already well known to physicists as the most reliable, albeit slow. The more recent appearance of additional algorithms based on the same theory gave us a better understanding of all of them. Working sometimes with the authors of these algorithms, Lorenzo Moneta from EP’s SFT group and I have determined under what conditions they can generate numbers that are demonstrably random in a sense that will be defined below. The results of our studies to date are published in Ref. 1.

This is an exciting time in the long history of random numbers because it is now possible for a non-specialist to understand how the best RNGs work, something we always assumed was beyond our abilities. In fact the reason we couldn't understand RNGs for a long time was because there was nothing to understand, they weren’t actually random at all. But now that there are RNG’s based on the theory of mixing in classical mechanical systems, we can see where the randomness comes from, and in a real sense measure the randomness.

But from where does the difficulty stem? This is a critical question to answer if we want to progress and develop robust random-number generators. We will try to briefly sketch an answer though this will require some mathematical formulation.

Dynamical Systems and the Ergodic Hierarchy

Let us start our discussion, with the representation of dynamical systems appropriate for our purposes like the following: x(t+1) = A x(t) mod 1 (eq.1), where x(t) is the vector of N real numbers [0,1) giving the position of the trajectory in phase space at time t, and A is a N × N matrix of integers.

The N-dimensional state space is a unit hypercube which because of the modulo function becomes topologically equivalent to an N-dimensional torus by identifying opposite faces. All the elements of A will be integers, and the determinant is one, which assures that the system is volume-conserving. Therefore matrix A is invertible and the inverse matrix is also a matrix of integers which produces the same sequence of vectors xi as equation 1 but in the opposite order. The theory is intended for describing systems which have no analytical solution, which implies a high but finite number of dimensions (1 << N < ∞).

The theory of mixing in dynamical systems is based on the Ergodic Hierarchy (EH) which allows us to classify systems according to their degree of mixing (a term used in EH theory), which is asymptotically the same as independence (a term used in statistics) or randomness (as encountered in quantum mechanics and required by the theory of MC).

The Ergodic Hierarch can be defined as follows: Let x_i and x_j represent the state of the dynamical system at two times i and j. Furthermore, let v₁ and v₂ be any two subspaces of the entire allowed space of points x, with measures (volumes relative to the total allowed volume) respectively µ(v₁) and µ(v₂). Then the dynamical system is said to be a 1-system (with 1-mixing) if P(x_i ∈ v1) = µ(v₁) and a 2-system (with 2-mixing) if P(x_i ∈ v₁ and x_j ∈ v₂) = µ(v₁)µ(v₂) for all i and j sufficiently far apart, and for all subspaces v_i. Similarly, an n-system can be defined for all positive integer values n.

We define a zero system as having the ergodic property, namely that the state of the system will asymptotically come arbitrarily close to any point in the state space. Putting all this together, we have that asymptotically, a system with zero-mixing covers the entire state space, a system with one-mixing covers uniformly a system with two-mixing has 2 × 2 independence of points, etc.

Finally, a system with n-mixing for arbitrarily large values of n is said to have K-mixing and is a K-system. It is a result of the theory that the degrees of mixing form a hierarchy (Berkovitz, Frigg and Kronz) that is, a system which has n-mixing for any value of n also has i-mixing for all i < n. Theory also tells us that a dynamical system represented by equation 1 will be a K-system if and only if the matrix A has eigenvalues λ, none of which lie on the unit circle in the complex plane.

From theory to computer algorithms

According to theory, we have only to find a matrix A of integers with the required properties: determinant=1 and eigenvalues away from the unit circle. Then equation 1 will produce the sequence of random numbers grouped in vectors of length N, and the numbers will be independent if we reject some vectors which are too close together. The theory doesn’t tell us the value of N, but any value much greater than 1 seems to work. It also doesn’t specify how many vectors should be rejected, and that will turn out to be a fundamental problem we will address below, the problem of decimation.

Additional problems will arise in going from the continuous theory to the necessarily discrete realisation on the computer. The first problem is that straightforward application of equation 1 requires N² multiplications to produce N random numbers, too slow for serious MC calculations. Then there is the period of the generator, which must be much longer than any sequence that will be used in any calculation. The random number generators based on KA Mixing and currently available are (in chronological order of their publication):

RANLUX by M. Lucher (1994), Ref.2
MIXMAX original by K. Savvidy (2015), Ref. 3
MIXMAX extended by K. Savvidy and G. Savvidy (2016), Ref. 4
RANLUX++ by A. Sibidanov (2017), Ref. 5

Each of these generators (or families of generators) uses its own special trick to solve the problems of speed and period. These are described in detail in [1] and the publications cited above, and require some advanced mathematics, but the reader does not need to know these details in order to understand the randomness which is all based on KA mixing. For the curious readers, a slightly more detailed discussion of these algorithms and the story behind their development can also be found HERE.

Thanks to intensive efforts over the past years, today, we have three different algorithms capable of generating numbers that are random to the level of 2-mixing, and therefore also 1-mixing and zeromixing, provided the appropriate levels of decimation are applied.

With these levels of decimation, the latest version of RANLUX single-precision is about as fast as MIXMAX (N=256, skip-5 decimation) double-precision. Where double precision is required, MIXMAX is about twice as fast. RANLUX++ is special because it requires assembler code which is of course not portable, but if portability is not a requirement, it is the fastest RNG to offer 2-mixing, and at the same time it can offer higher decimation with no penalty in speed.

We are currently investigating the possibility of attaining higher orders of mixing using higher levels of decimation, as suggested by the Shannon entropy content of the skipping factor a p mod m of RANLUX++. That probably implies finding a way of investigating the behaviour of a generator that is sensitive to the higher mixing and therefore the smaller eigenvalues (closer to one). We have therefore looked at RNG’s using the Savvidy matrix for which the ”magic integer” was purposely set to have an eigenvalue exactly on the unit circle so it would NOT be a K-system, and we observed that the random numbers nevertheless pass the best empirical tests of randomness (U01 Big Crush, for the experts), so we will need a to find a method which makes explicit use of the definition of 3-mixing for example. The separation of nearby trajectories is of course unable to detect the eigenvalue of modulus one, since it depends only on the largest eigenvalue.

Further reading

1. F. James and L. Moneta, Review of High-Quality Random Number Generators, https://link.springer.com/content/pdf/10.1007/s41781-019- 0034-3.pdf

2. M. Luscher, A Portable High-Quality Random Number Generator for Lattice Field Theory Simulations, Computer Physics Communications 79 (1994) 100

3. K.Savvidy, The MIXMAX random number generator, Comput.Phys.Commun. 196 (2015) 161-165. (http://dx.doi.org/10.1016/j.cpc.2015.06.003); arXiv:1404.5355

4. K. Savvidy and G. Savvidy, Spectrum and Entropy of C-systems. MIXMAX random number generator, Chaos Solitons Fractals 91 (2016) 33 [arXiv:1510.06274[math.DS]].

5. A. Sibidanov, A revision of the subtract-with-borrow random number generators, Comput. Phys. Comm. 221 (2017) 299-303

CERN Accelerating science

Random Number Generators for Precision Monte Carlos

Dynamical Systems and the Ergodic Hierarchy

From theory to computer algorithms