12
Random-number generators
It is important to be able to generate independent random variables from the uniform distribution on (0, 1) efficiently, since:
• Random variables from all other distributions can be obtained by transforming uniform random variables;
• Simulations require many random numbers.
Most random-number generators are of the form:
Start with $z_0$ (seed)
For $n = 1, 2, \ldots$ generate
$z_n = f(z_{n-1})$
and
$u_n = g(z_n)$
$f$ is the pseudo-random generator
$g$ is the output function
$\{u_0, u_1, \ldots\}$ is the sequence of uniform random numbers on the interval (0, 1).
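The scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `uniform_stream` and the toy choices of f and g (multiplication by 3 modulo 31, then division by 31) are assumptions made for the example, not part of the slides.

```python
from typing import Callable, Iterator


def uniform_stream(seed: int,
                   f: Callable[[int], int],
                   g: Callable[[int], float]) -> Iterator[float]:
    """Yield u_1, u_2, ... with z_n = f(z_{n-1}) and u_n = g(z_n)."""
    z = seed
    while True:
        z = f(z)      # state transition (the pseudo-random generator)
        yield g(z)    # output function maps the state into (0, 1)


# Hypothetical toy generator: f(z) = 3z mod 31, g(z) = z / 31.
stream = uniform_stream(seed=1, f=lambda z: (3 * z) % 31, g=lambda z: z / 31)
first_three = [next(stream) for _ in range(3)]
```

Because the whole sequence is determined by the seed, restarting with the same seed reproduces the same numbers.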
A ‘good’ random-number generator should satisfy the following properties:
• Uniformity: The numbers generated appear to be distributed uniformly
on (0, 1);
• Independence: The numbers generated show no correlation with each
other;
• Replication: The numbers should be replicable (e.g., for debugging or comparison of different systems);
• Cycle length: It should take a long time before the numbers start to repeat;
• Speed: The generator should be fast;
• Memory usage: The generator should not require a lot of storage.
Linear (or mixed) congruential generators
Most random-number generators in use today are linear congruential generators. They produce a sequence of integers between 0 and $m - 1$ according to
$z_n = (a z_{n-1} + c) \bmod m, \quad n = 1, 2, \ldots$
$a$ is the multiplier, $c$ the increment and $m$ the modulus.
To obtain uniform random numbers on (0, 1) we take
$u_n = z_n / m$
A good choice of a, c and m is very important.
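A minimal Python sketch of the recurrence follows. The function name `lcg_uniforms` and the toy parameters a = 5, c = 3, m = 16 are assumptions chosen only to keep the arithmetic easy to follow; they are far too small for real use.

```python
from typing import List


def lcg_uniforms(a: int, c: int, m: int, seed: int, count: int) -> List[float]:
    """Return u_1, ..., u_count from z_n = (a*z_{n-1} + c) mod m, u_n = z_n/m."""
    z = seed
    us = []
    for _ in range(count):
        z = (a * z + c) % m   # next integer state in {0, ..., m-1}
        us.append(z / m)      # scale to the unit interval
    return us


# Toy parameters (assumed for illustration only).
sample = lcg_uniforms(a=5, c=3, m=16, seed=7, count=4)
```

With these toy values the integer states are 6, 1, 8, 11. Note that $u_n$ equals 0 whenever $z_n = 0$, so strictly speaking the outputs lie in [0, 1).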
A linear congruential generator has full period (cycle length is m) if and only
if the following conditions hold:
• The only positive integer that exactly divides both m and c is 1;
• If q is a prime number that divides m, then q divides a − 1;
• If 4 divides m, then 4 divides a − 1.
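The three conditions are easy to check mechanically. The sketch below is one possible way to do so; the helper names `prime_factors`, `has_full_period` and `cycle_length` are assumptions, and the brute-force `cycle_length` is only meant for small moduli, as a sanity check.

```python
from math import gcd


def prime_factors(n: int) -> set:
    """Return the set of primes dividing n (trial division)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a: int, c: int, m: int) -> bool:
    """Check the three full-period conditions for z_n = (a*z_{n-1} + c) mod m."""
    if gcd(c, m) != 1:                                  # condition 1
        return False
    if any((a - 1) % q != 0 for q in prime_factors(m)):  # condition 2
        return False
    if m % 4 == 0 and (a - 1) % 4 != 0:                 # condition 3
        return False
    return True


def cycle_length(a: int, c: int, m: int, seed: int = 0) -> int:
    """Brute-force length of the cycle reached from seed (small m only)."""
    first_seen = {}
    z, n = seed, 0
    while z not in first_seen:
        first_seen[z] = n
        z = (a * z + c) % m
        n += 1
    return n - first_seen[z]
```

For example, `has_full_period(5, 3, 16)` holds and the brute-force cycle length agrees with m = 16, whereas a = 3, c = 3, m = 16 violates the third condition and cycles sooner.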
Multiplicative congruential generators
These generators produce a sequence of integers between 0 and $m - 1$ according to
$z_n = a z_{n-1} \bmod m, \quad n = 1, 2, \ldots$
So they are linear congruential generators with $c = 0$.
They cannot have full period (the state 0 maps to itself), but it is possible to obtain period $m - 1$ (so each integer $1, \ldots, m - 1$ is obtained exactly once in each cycle) if $a$ and $m$ are chosen carefully. For example, $a = 630360016$ and $m = 2^{31} - 1$.
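As an illustration, the sketch below implements the multiplicative recurrence with the parameters quoted above and, for a small prime modulus, measures the cycle length by brute force. The function names and the small test modulus m = 31 are assumptions for the example; the brute-force loop would be impractically slow for $m = 2^{31} - 1$.

```python
from typing import List

A = 630360016
M = 2**31 - 1


def mcg_states(seed: int, count: int, a: int = A, m: int = M) -> List[int]:
    """Return z_1, ..., z_count from z_n = a*z_{n-1} mod m."""
    z = seed
    states = []
    for _ in range(count):
        z = (a * z) % m
        states.append(z)
    return states


def mcg_cycle_length(a: int, m: int, seed: int = 1) -> int:
    """Steps until the state returns to seed (assumes m prime, 0 < seed < m)."""
    z, n = (a * seed) % m, 1
    while z != seed:
        z = (a * z) % m
        n += 1
    return n


# Small prime modulus to show that the choice of a matters.
good = mcg_cycle_length(a=3, m=31)   # 3 generates all of 1..30
poor = mcg_cycle_length(a=5, m=31)   # 5 has a very short cycle
```

With m = 31 the multiplier a = 3 reaches period m − 1 = 30, while a = 5 repeats after only 3 steps.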
Additive congruential generators
These generators produce integers according to
$z_n = (z_{n-1} + z_{n-k}) \bmod m, \quad n = 1, 2, \ldots$
where $k \ge 2$. Uniform random numbers can again be obtained from
$u_n = z_n / m$
These generators can have a long period, up to $m^k$.
Disadvantage:
Consider the case $k = 2$ (the Fibonacci generator). If we take three consecutive numbers $u_{n-2}$, $u_{n-1}$ and $u_n$, then it will never happen that
$u_{n-2} < u_n < u_{n-1}$ or $u_{n-1} < u_n < u_{n-2}$
whereas for true uniform variables each of these orderings occurs with probability 1/6.
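The defect of the Fibonacci generator can be observed directly. The sketch below is an illustration under assumed names (`additive_generator`, `fraction_in_middle`) and assumed seeds and modulus; it compares the generator with Python's built-in `random` module, which serves here only as a reference sequence.

```python
import random
from typing import List, Sequence


def additive_generator(seeds: Sequence[int], m: int, count: int) -> List[int]:
    """seeds = (z_0, ..., z_{k-1}); returns the next `count` values of
    z_n = (z_{n-1} + z_{n-k}) mod m."""
    state = list(seeds)
    k = len(state)
    out = []
    for _ in range(count):
        z = (state[-1] + state[-k]) % m   # z_{n-1} + z_{n-k}
        state.append(z)
        state.pop(0)                      # keep only the last k values
        out.append(z)
    return out


def fraction_in_middle(values: Sequence[float]) -> float:
    """Fraction of consecutive triples whose third value lies strictly
    between the first two."""
    triples = list(zip(values, values[1:], values[2:]))
    hits = sum(1 for x, y, z in triples if x < z < y or y < z < x)
    return hits / len(triples)


# Fibonacci generator (k = 2) with assumed seeds and modulus.
fib = additive_generator([12345, 67890], m=2**31, count=10_000)
fib_fraction = fraction_in_middle(fib)

# Reference sequence from the standard library.
rng = random.Random(0)
ref_fraction = fraction_in_middle([rng.random() for _ in range(10_000)])
```

For the Fibonacci generator the fraction is exactly 0, whereas for independent uniforms it should be near 1/6 + 1/6 = 1/3.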
(Pseudo) Random number generators:
• Linear (or mixed) congruential generators
• Multiplicative congruential generators
• Additive congruential generators
• ...
How random are pseudorandom numbers?
Testing random number generators
Try to test two main properties:
• Uniformity;
• Independence.
Uniformity or goodness-of-fit tests:
Let $X_1, \ldots, X_n$ be $n$ observations. A goodness-of-fit test can be used to test the hypothesis:
$H_0$: The $X_i$'s are i.i.d. random variables with distribution function $F$.
Two goodness-of-fit tests:
• Kolmogorov-Smirnov test
• Chi-Square test
Kolmogorov-Smirnov test
Let $F_n(x)$ be the empirical distribution function, so
$F_n(x) = \frac{\text{number of } X_i\text{'s} \le x}{n}$
Then, if $H_0$ is true,
$D_n = \sup_x |F_n(x) - F(x)|$
has the Kolmogorov-Smirnov (K-S) distribution.
Now we reject $H_0$ if
$D_n > d_{n,1-\alpha}$
where $d_{n,1-\alpha}$ is the $1 - \alpha$ quantile of the K-S distribution.
Here $\alpha$ is the significance level of the test:
The probability of rejecting $H_0$ given that $H_0$ is true.
For $n \ge 100$,
$d_{n,0.95} \approx 1.3581 / \sqrt{n}$
In case of the uniform distribution we have
$F(x) = x, \quad 0 \le x \le 1.$
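For the uniform case the statistic $D_n$ can be computed from the sorted sample, since the supremum is attained at a jump of $F_n$. The following is a sketch under that observation; the function names are assumptions, and the rejection rule uses the large-sample approximation quoted above, so it is only meant for $n \ge 100$.

```python
import math
from typing import Sequence


def ks_statistic_uniform(xs: Sequence[float]) -> float:
    """D_n = sup_x |F_n(x) - x| for a sample that should be U(0, 1)."""
    ordered = sorted(xs)
    n = len(ordered)
    # Just after the i-th smallest point F_n equals i/n; just before, (i-1)/n.
    d_plus = max((i + 1) / n - x for i, x in enumerate(ordered))
    d_minus = max(x - i / n for i, x in enumerate(ordered))
    return max(d_plus, d_minus)


def ks_reject_at_5_percent(xs: Sequence[float]) -> bool:
    """Reject H0 if D_n exceeds the approximate 0.95 quantile (n >= 100)."""
    return ks_statistic_uniform(xs) > 1.3581 / math.sqrt(len(xs))


# An evenly spread sample and a degenerate one, both of size 200.
even_sample = [(i + 0.5) / 200 for i in range(200)]
constant_sample = [0.5] * 200
```

The evenly spread sample gives a very small $D_n$ and is not rejected, while the constant sample gives $D_n = 0.5$ and is rejected.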
Chi-Square test
Divide the range of $F$ into $k$ adjacent intervals
$(a_0, a_1], (a_1, a_2], \ldots, (a_{k-1}, a_k]$
Let
$N_j$ = number of $X_i$'s in $(a_{j-1}, a_j]$
and let $p_j$ be the probability of an outcome in $(a_{j-1}, a_j]$, so
$p_j = F(a_j) - F(a_{j-1})$
Then the test statistic is
$\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}$
If $H_0$ is true, then $n p_j$ is the expected number of the $n$ $X_i$'s that fall in the $j$-th interval, and so we expect $\chi^2$ to be small.
If $H_0$ is true, then the distribution of $\chi^2$ converges to a chi-square distribution with $k - 1$ degrees of freedom as $n \to \infty$.
The chi-square distribution with $k - 1$ degrees of freedom is the same as the Gamma distribution with shape parameter $(k - 1)/2$ and scale parameter 2.
Hence, we reject $H_0$ if
$\chi^2 > \chi^2_{k-1,1-\alpha}$
where $\chi^2_{k-1,1-\alpha}$ is the $1 - \alpha$ quantile of the chi-square distribution with $k - 1$ degrees of freedom.
Chi-square test for U (0, 1) random variables
We divide (0, 1) into $k$ subintervals of equal length and generate $U_1, \ldots, U_n$; it is recommended to choose $k \ge 100$ and $n/k \ge 5$. Let $N_j$ be the number of the $n$ $U_i$'s in the $j$-th subinterval.
Then
$\chi^2 = \frac{k}{n} \sum_{j=1}^{k} \left( N_j - \frac{n}{k} \right)^2$
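The statistic for the uniform case takes only a few lines to compute. The sketch below uses an assumed function name, `chi_square_uniform`, and bins with `int(u * k)`, which treats the subintervals as closed on the left rather than on the right; for continuous data this makes no practical difference.

```python
from typing import Sequence


def chi_square_uniform(us: Sequence[float], k: int) -> float:
    """Chi-square statistic (k/n) * sum_j (N_j - n/k)^2 for k equal bins."""
    n = len(us)
    counts = [0] * k
    for u in us:
        j = min(int(u * k), k - 1)   # bin index; guard against u == 1.0
        counts[j] += 1
    expected = n / k
    return (k / n) * sum((c - expected) ** 2 for c in counts)


# Tiny hand-checkable samples with k = 2 bins.
balanced = chi_square_uniform([0.1, 0.2, 0.6, 0.9], k=2)     # counts 2 and 2
unbalanced = chi_square_uniform([0.1, 0.2, 0.3, 0.9], k=2)   # counts 3 and 1
```

The result is then compared with the $1 - \alpha$ quantile of the chi-square distribution with $k - 1$ degrees of freedom, taken from a table or, if SciPy is available, from `scipy.stats.chi2.ppf(1 - alpha, k - 1)`.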
Example:
Consider the linear congruential generator
$z_n = a z_{n-1} \bmod m$
with $a = 630360016$, $m = 2^{31} - 1$ and seed
$z_0 = 1973272912$
Generating $n = 2^{15} = 32768$ random numbers $U_i$ and dividing (0, 1) into $k = 2^{12} = 4096$ subintervals yields
$\chi^2 = 4141.0$
Since
$\chi^2_{4095,0.9} \approx 4211.4$
we do not reject $H_0$ at level $\alpha = 0.1$.
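A sketch of how this experiment could be set up in Python is given below. It assumes that the first number used is $U_1 = z_1/m$ (the seed itself is not used as an output) and that bins are closed on the left; the slides do not state these conventions, so the value obtained need not coincide exactly with the one quoted above.

```python
A = 630360016
M = 2**31 - 1
SEED = 1973272912
N = 2**15       # number of random numbers
K = 2**12       # number of subintervals
CRITICAL_VALUE = 4211.4   # 0.9 quantile for 4095 degrees of freedom (from the text)


def example_statistic() -> float:
    """Chi-square statistic for N outputs of z_n = A*z_{n-1} mod M in K bins."""
    z = SEED
    counts = [0] * K
    for _ in range(N):
        z = (A * z) % M
        counts[(z * K) // M] += 1   # bin of u = z/M, in exact integer arithmetic
    expected = N / K
    return (K / N) * sum((c - expected) ** 2 for c in counts)


chi2 = example_statistic()
reject = chi2 > CRITICAL_VALUE
```

Using integer arithmetic for the bin index avoids any floating-point rounding at the bin boundaries.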
Serial test
This is a 2-dimensional version of the chi-square test to test independence
between successive observations.
We generate $U_1, \ldots, U_{2n}$; if the $U_i$'s are really i.i.d. $U(0, 1)$, then the non-overlapping pairs
$(U_1, U_2), (U_3, U_4), \ldots, (U_{2n-1}, U_{2n})$
are i.i.d. random vectors uniformly distributed in the square $(0, 1)^2$.
• Divide the square $(0, 1)^2$ into $k^2$ subsquares of equal size ($k$ subintervals along each axis);
• Count how many outcomes fall in each subsquare;
• Apply a chi-square test to these data.
This test can be generalized to higher dimensions.
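The three steps can be sketched as follows. The function name `serial_test_statistic` and the symbol `k` for the number of subintervals along each axis are assumptions made for this example; the resulting statistic would be compared with a chi-square quantile with $k^2 - 1$ degrees of freedom.

```python
from typing import Sequence


def serial_test_statistic(us: Sequence[float], k: int) -> float:
    """Chi-square statistic for non-overlapping pairs on a k-by-k grid."""
    pairs = list(zip(us[0::2], us[1::2]))   # (U1, U2), (U3, U4), ...
    n = len(pairs)
    counts = [[0] * k for _ in range(k)]
    for u, v in pairs:
        row = min(int(u * k), k - 1)
        col = min(int(v * k), k - 1)
        counts[row][col] += 1
    expected = n / k**2                      # each subsquare is equally likely
    return sum((c - expected) ** 2 for r in counts for c in r) / expected


# Hand-checkable inputs with a 2-by-2 grid.
spread = serial_test_statistic([0.1, 0.1, 0.1, 0.9, 0.9, 0.1, 0.9, 0.9], k=2)
clumped = serial_test_statistic([0.1, 0.1] * 4, k=2)
```

In the first input every subsquare receives exactly one pair, so the statistic is 0; in the second all four pairs land in the same subsquare, which gives 12.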
Permutation test
Look at $n$ successive $d$-tuples of outcomes
$(U_0, \ldots, U_{d-1}), (U_d, \ldots, U_{2d-1}), \ldots, (U_{(n-1)d}, \ldots, U_{nd-1})$
Each $d$-tuple has one of $d!$ possible orderings, and if the $U_i$'s are i.i.d. these orderings are equally likely.
• Determine the frequencies of the different orderings among the $n$ $d$-tuples;
• Apply a chi-square test to these data.
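One way to carry out these steps is to label each tuple by the permutation that sorts it. The sketch below does this under an assumed function name; the statistic would be compared with a chi-square quantile with $d! - 1$ degrees of freedom.

```python
from itertools import permutations
from math import factorial
from typing import Sequence


def permutation_test_statistic(us: Sequence[float], d: int) -> float:
    """Chi-square statistic for the orderings of non-overlapping d-tuples."""
    n = len(us) // d
    counts = {p: 0 for p in permutations(range(d))}
    for t in range(n):
        block = us[t * d:(t + 1) * d]
        # The ordering is the index permutation that sorts the block.
        order = tuple(sorted(range(d), key=lambda i: block[i]))
        counts[order] += 1
    expected = n / factorial(d)   # all d! orderings are equally likely under H0
    return sum((c - expected) ** 2 for c in counts.values()) / expected


# Hand-checkable inputs with d = 2.
mixed = permutation_test_statistic([0.1, 0.2, 0.4, 0.3], d=2)
all_increasing = permutation_test_statistic([0.1, 0.2, 0.3, 0.4], d=2)
```

With one increasing and one decreasing pair the statistic is 0; with two increasing pairs it is 2.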
Runs-up test
Divide the sequence $U_0, U_1, \ldots$ into blocks, where each block is a subsequence of increasing numbers followed by a number that is smaller than its predecessor.
Example: The realization 1,3,8,6,2,0,7,9,5 can be divided into the blocks (1,3,8,6), (2,0), (7,9,5).
A block consisting of $j + 1$ numbers is called a run-up of length $j$. It holds that
$P(\text{run-up of length } j) = \frac{1}{j!} - \frac{1}{(j + 1)!}$
• Generate $n$ run-ups;
• Count the number of run-ups of length $1, 2, \ldots, k - 1$ and $\ge k$ (length 0 cannot occur, since every block contains at least two numbers);
• Apply a chi-square test to these data.
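The block decomposition and the test statistic can be sketched as below. The function names are assumptions. The categories used are lengths 1, ..., k − 1 and ≥ k, because the formula above gives probability 0 to length 0; the probability of a run-up of length ≥ k is 1/k!, which follows by summing the formula over j ≥ k.

```python
from math import factorial
from typing import List, Sequence


def run_up_lengths(seq: Sequence[float]) -> List[int]:
    """Lengths j of the completed blocks (a block has j + 1 numbers)."""
    lengths = []
    j = 0            # size of the current increasing stretch; 0 = start a block
    prev = None
    for x in seq:
        if j == 0 or x > prev:
            j += 1               # x extends (or starts) the increasing stretch
            prev = x
        else:
            lengths.append(j)    # x is the smaller number that closes the block
            j = 0
    return lengths


def runs_up_statistic(seq: Sequence[float], k: int) -> float:
    """Chi-square statistic over categories 1, ..., k-1 and >= k."""
    lengths = run_up_lengths(seq)
    n = len(lengths)
    counts = [0] * k
    for j in lengths:
        counts[min(j, k) - 1] += 1
    probs = [1 / factorial(j) - 1 / factorial(j + 1) for j in range(1, k)]
    probs.append(1 / factorial(k))           # P(length >= k)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))


example_lengths = run_up_lengths([1, 3, 8, 6, 2, 0, 7, 9, 5])
```

For the realization in the example, the blocks (1,3,8,6), (2,0), (7,9,5) give run-up lengths 3, 1 and 2. With k categories the statistic is compared with a chi-square quantile with k − 1 degrees of freedom.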
Correlation test
Generate $U_1, \ldots, U_n$ and compute an estimate for the (serial) correlation
$\hat\rho_1 = \frac{\sum_{i=1}^{n} (U_i - \bar U(n))(U_{i+1} - \bar U(n))}{\sum_{i=1}^{n} (U_i - \bar U(n))^2}$
where $U_{n+1} = U_1$ and $\bar U(n)$ is the sample mean.
If the $U_i$'s are really i.i.d. $U(0, 1)$, then $\hat\rho_1$ should be close to zero. Hence we reject $H_0$ if $|\hat\rho_1|$ is too large.
If $H_0$ is true, then for large $n$,
$P(-2/\sqrt{n} \le \hat\rho_1 \le 2/\sqrt{n}) \approx 0.95$
So we reject $H_0$ at the 5% level if
$\hat\rho_1 \notin (-2/\sqrt{n}, 2/\sqrt{n})$
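The estimate and the rejection rule translate directly into code. The sketch below uses assumed function names and treats the sample as circular, so that the successor of the last observation is the first one, as in the formula above.

```python
import math
from typing import Sequence


def serial_correlation(us: Sequence[float]) -> float:
    """Lag-1 serial correlation estimate with wrap-around U_{n+1} = U_1."""
    n = len(us)
    mean = sum(us) / n
    numerator = sum((us[i] - mean) * (us[(i + 1) % n] - mean) for i in range(n))
    denominator = sum((u - mean) ** 2 for u in us)
    return numerator / denominator


def reject_at_5_percent(us: Sequence[float]) -> bool:
    """Reject H0 if the estimate falls outside (-2/sqrt(n), 2/sqrt(n))."""
    return abs(serial_correlation(us)) >= 2 / math.sqrt(len(us))


# A strictly alternating sequence is as negatively correlated as possible.
alternating = [0.1, 0.9] * 50
rho = serial_correlation(alternating)
```

For the alternating sequence the estimate is −1 up to rounding, far outside the interval, so independence is rejected.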