CPE 619 Testing Random-Number Generators
Aleksandar Milenković
The LaCASA Laboratory, Electrical and Computer Engineering Department, The University of Alabama in Huntsville
Overview
Chi-square test
Kolmogorov-Smirnov test
Serial-correlation test
Two-level tests
K-dimensional uniformity or k-distributivity
Serial test
Spectral test
Testing Random-Number Generators
Goal: To ensure that the random number generator produces a random stream
Plot histograms
Plot quantile-quantile plot
Use other tests
Passing a test is necessary but not sufficient
Pass ⇏ Good
Fail ⇒ Bad
New tests ⇒ Old generators fail the test
Tests can be adapted for other distributions
Chi-Square Test
Most commonly used test
Can be used for any distribution
Prepare a histogram of the observed data
Compare observed frequencies with theoretical:
$$D = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$$
$k$ = number of cells
$o_i$ = observed frequency for the $i$th cell
$e_i$ = expected frequency for the $i$th cell
$D = 0$ ⇒ exact fit
$D$ has a chi-square distribution with $k-1$ degrees of freedom
⇒ Compare $D$ with $\chi^2_{[1-\alpha;\,k-1]}$
Pass at significance level $\alpha$ if $D$ is less
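The statistic D can be sketched in a few lines of Python. This is a minimal illustration for U(0, 1) samples with equal-width cells; the function name, the seed, and the choice of 10 cells are assumptions made here, not part of the slides.

```python
import random

def chi_square_statistic(samples, k):
    """Return D = sum((o_i - e_i)^2 / e_i) for k equal-width cells on [0, 1]."""
    observed = [0] * k
    for u in samples:
        # min() keeps u == 1.0 inside the last cell
        observed[min(int(u * k), k - 1)] += 1
    expected = len(samples) / k  # equal expected frequency in every cell
    return sum((o - expected) ** 2 / expected for o in observed)

rng = random.Random(1)  # hypothetical seed, Python's own generator
d = chi_square_statistic([rng.random() for _ in range(1000)], k=10)
# With 10 cells, compare d against the tabulated quantile chi^2[0.90; 9] = 14.68
print(f"D = {d:.3f}")
```

A D below the tabulated quantile means the uniformity hypothesis is not rejected at that level.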
Example 27.1
1000 random numbers with $x_0 = 1$
Observed difference = 10.380
Observed is less than the tabulated chi-square quantile ⇒ Accept IID U(0, 1)
Chi-Square for Other Distributions
Errors in cells with a small $e_i$ affect the chi-square statistic more
Best when the $e_i$'s are equal
⇒ Use an equi-probable histogram with variable cell sizes
Combine adjoining cells so that the new cell probabilities are approximately equal
The number of degrees of freedom should be reduced to $k-r-1$ (in place of $k-1$), where $r$ is the number of parameters estimated from the sample
Designed for discrete distributions and for large sample sizes only ⇒ Lower significance for finite sample sizes and continuous distributions
If a cell has fewer than 5 observations, combine neighboring cells
Kolmogorov-Smirnov Test
Developed by A. N. Kolmogorov and N. V. Smirnov
Designed for continuous distributions
Difference between the observed CDF (cumulative distribution function) $F_o(x)$ and the expected CDF $F_e(x)$ should be small
Kolmogorov-Smirnov Test (contd)
$K^+$ = maximum observed deviation above the expected CDF
$K^-$ = maximum observed deviation below the expected CDF
$$K^+ = \sqrt{n}\,\max_x\,[F_o(x) - F_e(x)], \qquad K^- = \sqrt{n}\,\max_x\,[F_e(x) - F_o(x)]$$
$K^+ < K_{[1-\alpha;n]}$ and $K^- < K_{[1-\alpha;n]}$ ⇒ Pass at the $\alpha$ level of significance
Don't use max/min of $F_e(x_i) - F_o(x_i)$
Use $F_e(x_{i+1}) - F_o(x_i)$ for $K^-$
For U(0, 1): $F_e(x) = x$
$F_o(x) = j/n$, where $x > x_1, x_2, \ldots, x_{j-1}$
Example 27.2
30 Random numbers using a seed of x0=15:
The numbers are: 14, 11, 2, 6, 18, 23, 7, 21, 1, 3, 9, 27, 19, 26, 16, 17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15.
Example 27.2 (contd)
The normalized numbers obtained by dividing the sequence by 31 are: 0.45161, 0.35484, 0.06452, 0.19355, 0.58065, 0.74194, 0.22581, 0.67742, 0.03226, 0.09677, 0.29032, 0.87097, 0.61290, 0.83871, 0.51613, 0.54839, 0.64516, 0.93548, 0.80645, 0.41935, 0.25806, 0.77419, 0.32258, 0.96774, 0.90323, 0.70968, 0.12903, 0.38710, 0.16129, 0.48387.
Example 27.2 (contd)
The $K_{[0.9;n]}$ value for $n = 30$ and $\alpha = 0.1$ is 1.0424
Observed < Table ⇒ Pass
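The K-S computation for the 30 numbers of Example 27.2 can be checked with a short sketch. It assumes the scaled statistics K+ = sqrt(n) max(j/n - x_j) and K- = sqrt(n) max(x_j - (j-1)/n) over the sorted sample; the function name is illustrative only.

```python
import math

def ks_statistics(samples):
    """Return (K+, K-) for testing samples against U(0, 1)."""
    xs = sorted(samples)
    n = len(xs)
    # j is 0-based here: the observed CDF is (j+1)/n at xs[j] and j/n just below it
    k_plus = math.sqrt(n) * max((j + 1) / n - x for j, x in enumerate(xs))
    k_minus = math.sqrt(n) * max(x - j / n for j, x in enumerate(xs))
    return k_plus, k_minus

numbers = [14, 11, 2, 6, 18, 23, 7, 21, 1, 3, 9, 27, 19, 26, 16,
           17, 20, 29, 25, 13, 8, 24, 10, 30, 28, 22, 4, 12, 5, 15]
k_plus, k_minus = ks_statistics([x / 31 for x in numbers])
# Both values should be compared with the table value K[0.9; 30] = 1.0424
print(f"K+ = {k_plus:.4f}, K- = {k_minus:.4f}")
```

Both statistics come out well below 1.0424, which agrees with the "pass" conclusion of the example.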
Chi-square vs. K-S Test
Serial-Correlation Test
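Nonzero covariance ⇒ Dependence. The inverse is not true
$R_k$ = Autocovariance at lag $k$ = Cov[$x_n$, $x_{n+k}$]
$$R_k = \frac{1}{n-k}\sum_{i=1}^{n-k}\left(x_i - \tfrac{1}{2}\right)\left(x_{i+k} - \tfrac{1}{2}\right)$$
For large $n$, $R_k$ is normally distributed with a mean of zero and a variance of $1/[144(n-k)]$
100(1-α)% confidence interval for the autocovariance is:
$$R_k \mp \frac{z_{1-\alpha/2}}{12\sqrt{n-k}}$$
For $k \ge 1$: Check if the CI includes zero
For $k = 0$: $R_0$ = variance of the sequence, expected to be 1/12 for IID U(0,1)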
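A minimal sketch of the serial-correlation check follows. It computes the lag-k autocovariance around the U(0, 1) mean of 1/2 and an interval whose half-width follows from the variance 1/[144(n-k)]. The function name, the seed, and the default z = 1.645 (90% interval) are choices made here for illustration.

```python
import math
import random

def autocovariance_ci(u, k, z=1.645):
    """Return (R_k, lower, upper) for lag k; z = 1.645 gives a 90% interval."""
    n = len(u)
    r_k = sum((u[i] - 0.5) * (u[i + k] - 0.5) for i in range(n - k)) / (n - k)
    half_width = z / (12 * math.sqrt(n - k))  # z times sqrt(1 / (144 (n - k)))
    return r_k, r_k - half_width, r_k + half_width

rng = random.Random(1)  # hypothetical seed, Python's own generator
stream = [rng.random() for _ in range(10000)]
for lag in range(1, 11):
    r, lo, hi = autocovariance_ci(stream, lag)
    verdict = "includes zero" if lo <= 0.0 <= hi else "excludes zero"
    print(f"lag {lag:2d}: R = {r:+.6f}  CI = ({lo:+.6f}, {hi:+.6f})  {verdict}")
```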
Example 27.3: Serial Correlation Test
10,000 random numbers with x0=1:
Example 27.3 (contd)
All confidence intervals include zero ⇒ All covariances are statistically insignificant at 90% confidence.
Two-Level Tests
If the sample size is too small, the test results may apply locally, but not globally to the complete cycle
Similarly, a global test may not apply locally
⇒ Use two-level tests
Use a chi-square test on n samples of size k each, and then use a chi-square test on the set of n chi-square statistics so obtained ⇒ Chi-square on chi-square test
Similarly, K-S on K-S
Can also use this to find a "nonrandom" segment of an otherwise random sequence
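One way to sketch the chi-square on chi-square idea: the first level uses 3 cells per sample, an assumption made here so that each statistic has 2 degrees of freedom and its CDF has the closed form 1 - exp(-D/2). The second level then tests whether those CDF values look uniform. Function names and sizes are illustrative.

```python
import math
import random

def chi_square_statistic(samples, cells):
    """Chi-square statistic against U(0, 1) with equal-width cells."""
    observed = [0] * cells
    for u in samples:
        observed[min(int(u * cells), cells - 1)] += 1
    expected = len(samples) / cells
    return sum((o - expected) ** 2 / expected for o in observed)

def two_level_chi_square(stream, n_samples, sample_size, cells2=10):
    """Return (first-level statistics, second-level statistic)."""
    # First level: one chi-square statistic per sample (3 cells, 2 degrees of freedom)
    first = [chi_square_statistic(stream[i * sample_size:(i + 1) * sample_size], 3)
             for i in range(n_samples)]
    # Under the null hypothesis these CDF values are approximately U(0, 1)
    cdf_values = [1.0 - math.exp(-d / 2.0) for d in first]
    # Second level: chi-square test on the transformed first-level statistics
    return first, chi_square_statistic(cdf_values, cells2)

rng = random.Random(7)  # hypothetical seed
stream = [rng.random() for _ in range(50 * 300)]
first, second = two_level_chi_square(stream, n_samples=50, sample_size=300)
print(f"largest first-level D = {max(first):.3f}, second-level D = {second:.3f}")
```

An unusually large first-level statistic points at the segment of the stream that deserves a closer look.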
k-Distributivity
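k-distributivity = k-dimensional uniformity
Chi-square ⇒ uniformity in one dimension ⇒ Given two real numbers $a_1$ and $b_1$ between 0 and 1 such that $b_1 > a_1$:
$$P(a_1 \le u_n < b_1) = b_1 - a_1$$
This is known as the 1-distributivity property of $u_n$.
The 2-distributivity is a generalization of this property in two dimensions:
$$P(a_1 \le u_{n-1} < b_1 \text{ and } a_2 \le u_n < b_2) = (b_1 - a_1)(b_2 - a_2)$$
For all choices of $a_1, b_1, a_2, b_2$ in [0, 1], with $b_1 > a_1$ and $b_2 > a_2$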
k-Distributivity (contd)
k-distributed if:
$$P(a_1 \le u_n < b_1, \ldots, a_k \le u_{n+k-1} < b_k) = (b_1 - a_1)\cdots(b_k - a_k)$$
For all choices of $a_i, b_i$ in [0, 1], with $b_i > a_i$, $i = 1, 2, \ldots, k$
A k-distributed sequence is always (k-1)-distributed. The inverse is not true.
Tests:
1. Serial test
2. Spectral test
3. Visual test for 2 dimensions: Plot successive overlapping pairs of numbers
Example 27.4
Tausworthe sequence generated by:
The sequence is k-distributed for k up to $\lceil q/l \rceil$, that is, k = 1.
In two dimensions: Successive overlapping pairs ($x_n$, $x_{n+1}$)
Example 27.5
Consider the polynomial:
Better 2-distributivity than Example 27.4
Serial Test
Goal: To test for uniformity in two dimensions or higher
In two dimensions, divide the space between 0 and 1 into $K^2$ cells of equal area
[Figure: unit square divided into $K^2$ cells; axes $x_n$ and $x_{n+1}$]
Serial Test (contd)
Given $\{x_1, x_2, \ldots, x_n\}$, use n/2 non-overlapping pairs $(x_1, x_2), (x_3, x_4), \ldots$ and count the points in each of the $K^2$ cells
Expected = $n/(2K^2)$ points in each cell
Use chi-square test to find the deviation of the actual counts from the expected counts
The degrees of freedom in this case are $K^2-1$
For k dimensions: use k-tuples of non-overlapping values
k-tuples must be non-overlapping
Overlapping ⇒ number of points in the cells are not independent ⇒ chi-square test cannot be used
In visual check one can use overlapping or non-overlapping
In the spectral test overlapping tuples are used
Given n numbers, there are n-1 overlapping pairs, n/2 non-overlapping pairs
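The two-dimensional serial test can be sketched as follows. The function name and the demo values are illustrative; the returned statistic still has to be compared with a tabulated chi-square quantile for $K^2-1$ degrees of freedom.

```python
import random

def serial_test_2d(x, K):
    """Return (D, degrees of freedom) using non-overlapping pairs of x."""
    # Pairs (x1, x2), (x3, x4), ...: every value is used at most once
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    counts = [[0] * K for _ in range(K)]
    for a, b in pairs:
        counts[min(int(a * K), K - 1)][min(int(b * K), K - 1)] += 1
    expected = len(pairs) / (K * K)  # n / (2 K^2) points per cell
    d = sum((c - expected) ** 2 / expected for row in counts for c in row)
    return d, K * K - 1

rng = random.Random(3)  # hypothetical seed
d, dof = serial_test_2d([rng.random() for _ in range(2000)], K=5)
print(f"D = {d:.3f} with {dof} degrees of freedom")
```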
Spectral Test
Goal: To determine how densely the k-tuples $\{x_1, x_2, \ldots, x_k\}$ can fill up the k-dimensional hyperspace
The k-tuples from an LCG fall on a finite number of parallel hyper-planes
Successive pairs would lie on a finite number of lines
In three dimensions, successive triplets lie on a finite number of planes
Example 27.6: Spectral Test
Plot of overlapping pairs
All points lie on three straight lines.
Or:
Example 27.6 (contd)
In three dimensions, the points ($x_n$, $x_{n-1}$, $x_{n-2}$) for the above generator would lie on five planes given by:
Obtained by adding the following to equation
Note that $k + k_1$ will be an integer between 0 and 4.
Spectral Test (More)
Marsaglia (1968): Successive k-tuples obtained from an LCG fall on, at most, $(k!\,m)^{1/k}$ parallel hyper-planes, where m is the modulus used in the LCG
Example: $m = 2^{32}$: fewer than 2,953 hyper-planes will contain all 3-tuples, fewer than 566 hyper-planes will contain all 4-tuples, and fewer than 41 hyper-planes will contain all 10-tuples. Thus, this is a weakness of LCGs
Spectral test: Determine the max distance between adjacent hyper-planes
Larger distance ⇒ worse generator
In some cases, it can be done by complete enumeration
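The hyper-plane counts quoted for $m = 2^{32}$ follow directly from Marsaglia's bound and can be reproduced with a short sketch (the function name is illustrative):

```python
import math

def max_hyperplanes(k, m):
    """Marsaglia's bound (k! * m)^(1/k) on the number of parallel hyper-planes."""
    return (math.factorial(k) * m) ** (1.0 / k)

for k in (3, 4, 10):
    # Integer part of the bound for each tuple size k
    print(k, math.floor(max_hyperplanes(k, 2 ** 32)))
```

The integer parts are 2953, 566, and 41 for k = 3, 4, and 10.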
Example 27.7
Compare the following two generators:
Using a seed of x0=15, first generator:
Using the same seed in the second generator:
Example 27.7 (contd)
Every number between 1 and 30 occurs once and only once ⇒ Both sequences will pass the chi-square test for uniformity
Example 27.7 (contd)
First Generator:
Example 27.7 (contd)
Three straight lines of positive slope or ten lines of negative slope
Since the distance between the lines of positive slope is more, consider only the lines with positive slope
Distance between two parallel lines $y = ax + c_1$ and $y = ax + c_2$ is given by:
$$\frac{|c_2 - c_1|}{\sqrt{1 + a^2}}$$
The distance between the above lines is 9.80
Example 27.7 (contd)
Second Generator:
Example 27.7 (contd)
All points fall on seven straight lines of positive slope or six straight lines of negative slope
Considering lines with negative slopes:
The distance between lines is 5.76
The second generator has a smaller maximum distance and, hence, better 2-distributivity
The set with a larger distance may not always be the set with fewer lines
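For a small multiplicative generator, the two-dimensional spectral test can be done by complete enumeration. The sketch below is an assumption-laden illustration: it takes generators of the form x[n] = a * x[n-1] mod m and uses multipliers 3 and 13 with modulus 31, values chosen here because they reproduce the distances 9.80 and 5.76 quoted above. Every overlapping pair satisfies s1 * x[n-1] + s2 * x[n] ≡ 0 (mod m) whenever s1 + a * s2 ≡ 0 (mod m), so the pairs lie on parallel lines spaced m / sqrt(s1^2 + s2^2) apart; the shortest such (s1, s2) gives the largest spacing.

```python
import math

def max_line_distance(a, m):
    """Largest spacing between adjacent parallel lines that cover all
    overlapping pairs (x[n-1], x[n]) of x[n] = a * x[n-1] mod m."""
    best = None
    for s2 in range(0, m + 1):
        for s1 in range(-m, m + 1):
            # Keep nonzero integer vectors with s1 + a * s2 divisible by m
            if (s1, s2) != (0, 0) and (s1 + a * s2) % m == 0:
                norm_sq = s1 * s1 + s2 * s2
                if best is None or norm_sq < best:
                    best = norm_sq
    return m / math.sqrt(best)

# Assumed generators: multipliers 3 and 13, modulus 31
for a in (3, 13):
    print(f"a = {a:2d}: maximum distance = {max_line_distance(a, 31):.2f}")
```

The brute-force search is only practical for a small modulus; for large m and higher dimensions a proper lattice-reduction method is needed.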
Example 27.7 (contd)
Either overlapping or non-overlapping k-tuples can be used
With overlapping k-tuples, we have k times as many points, which makes the graph visually more complete. The number of hyperplanes and the distance between them are the same with either choice
With the serial test, only non-overlapping k-tuples should be used
For generators with a large m and for higher dimensions, finding the maximum distance becomes quite complex. See Knuth (1981)
Summary
Chi-square test is a one-dimensional test
Designed for discrete distributions and large sample sizes
K-S test is designed for continuous variables
Serial correlation test for independence
Two-level tests find local non-uniformity
k-dimensional uniformity = k-distributivity, tested by spectral test or serial test
Random Variate Generation
Overview
Inverse transformation
Rejection
Composition
Convolution
Characterization
Random-Variate Generation
General techniques
Only a few techniques may apply to a particular distribution
Look up the distribution in Chapter 29
Inverse Transformation
Used when $F^{-1}$ can be determined either analytically or empirically
[Figure: CDF $F(x)$ plotted against $x$; a value $u$ on the vertical axis (0 to 1.0) maps back to $x$]
Proof
Example 28.1
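For exponential variates:
$$F(x) = 1 - e^{-\lambda x} = u \quad\Rightarrow\quad x = -\frac{1}{\lambda}\ln(1-u)$$
If $u$ is U(0,1), $1-u$ is also U(0,1)
Thus, exponential variables can be generated by:
$$x = -\frac{1}{\lambda}\ln(u)$$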
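A minimal sketch of inverse transformation for exponential variates, assuming a rate parameter (called `lam` here) and Python's standard generator as the source of U(0, 1) numbers:

```python
import math
import random

def exponential_variate(lam, rng=random):
    """Inverse transformation: x = -(1/lam) * ln(u) with u ~ U(0, 1)."""
    u = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    return -math.log(u) / lam

rng = random.Random(42)  # hypothetical seed
sample = [exponential_variate(2.0, rng) for _ in range(20000)]
print(f"sample mean = {sum(sample) / len(sample):.4f} (theory: 0.5)")
```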
Example 28.2
The packet sizes (trimodal) probabilities:
The CDF for this distribution is:
Example 28.2 (contd)
The inverse function is:
Note: CDF is continuous from the right ⇒ the value on the right of the discontinuity is used
⇒ The inverse function is continuous from the left ⇒ u = 0.7 ⇒ x = 64
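For a discrete distribution the inverse can be sketched as a table lookup. The table below is hypothetical: it assumes packet sizes of 64, 128, and 512 bytes with probabilities 0.7, 0.1, and 0.2, chosen only so that u = 0.7 maps to x = 64 as noted above.

```python
import random

# Hypothetical distribution: (cumulative probability, packet size in bytes)
CDF_TABLE = [(0.7, 64), (0.8, 128), (1.0, 512)]

def packet_size(u):
    """Return the smallest size x whose CDF value F(x) is >= u."""
    for cumulative, size in CDF_TABLE:
        # '<=' makes the inverse continuous from the left: u = 0.7 gives 64
        if u <= cumulative:
            return size
    return CDF_TABLE[-1][1]

rng = random.Random(5)  # hypothetical seed
print([packet_size(rng.random()) for _ in range(10)])
```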
Applications of the Inverse-Transformation Technique
Rejection
Can be used if a pdf $g(x)$ exists such that $c\,g(x)$ majorizes the pdf $f(x)$ ⇒ $c\,g(x) \ge f(x)\ \forall x$
Steps:
1. Generate x with pdf g(x)
2. Generate y uniform on [0, c g(x)]
3. If y ≤ f(x), then output x and return. Otherwise, repeat from step 1
⇒ Continue rejecting the random variates x and y until y ≤ f(x)
Efficiency = how closely c g(x) envelopes f(x)
Large area between c g(x) and f(x) ⇒ Large percentage of (x, y) generated in steps 1 and 2 are rejected
If generation of g(x) is complex, this method may not be efficient
Example 28.3
Beta(2,4) density function:
$$f(x) = 20x(1-x)^3, \quad 0 \le x \le 1$$
Bounded inside a rectangle of height 2.11 ⇒ Steps:
1. Generate x uniform on [0, 1]
2. Generate y uniform on [0, 2.11]
3. If $y \le 20x(1-x)^3$, then output x and return. Otherwise repeat from step 1
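The Beta(2,4) rejection steps translate almost line for line into code. This is a sketch using Python's standard generator for the uniform numbers; the function name and seed are illustrative.

```python
import random

def beta_2_4(rng=random):
    """Rejection sampling for f(x) = 20 x (1 - x)^3 under a box of height 2.11."""
    while True:
        x = rng.random()            # step 1: x uniform on [0, 1]
        y = 2.11 * rng.random()     # step 2: y uniform on [0, 2.11]
        if y <= 20 * x * (1 - x) ** 3:
            return x                # step 3: accept x, otherwise loop again

rng = random.Random(11)  # hypothetical seed
sample = [beta_2_4(rng) for _ in range(20000)]
print(f"sample mean = {sum(sample) / len(sample):.4f} (theory: 1/3)")
```

Because the density integrates to 1 and the box has area 2.11, roughly 1/2.11, or about 47%, of the generated pairs are accepted.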
Composition
Can be used if CDF F(x) = weighted sum of n other CDFs:
$$F(x) = \sum_{i=1}^{n} p_i F_i(x)$$
Here, $p_i \ge 0$, $\sum_{i=1}^{n} p_i = 1$, and the $F_i$'s are distribution functions
n CDFs are composed together to form the desired CDF ⇒ Hence, the name of the technique
The desired CDF is decomposed into several other CDFs ⇒ Also called decomposition
Can also be used if the pdf f(x) is a weighted sum of n other pdfs:
$$f(x) = \sum_{i=1}^{n} p_i f_i(x)$$
Steps:
1. Generate a random integer I such that $P(I = i) = p_i$. This can easily be done using the inverse-transformation method
2. Generate x with the $i$th pdf $f_i(x)$ and return
Example 28.4
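pdf: $f(x) = \frac{1}{2a} e^{-|x|/a}$
Composition of two exponential pdf's
Generate $u_1 \sim U(0,1)$ and $u_2 \sim U(0,1)$
If $u_1 < 0.5$, return $x = -a \ln u_2$; otherwise return $x = a \ln u_2$
Inverse transformation is better for Laplace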
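A sketch of composition for the Laplace case: one uniform number picks the positive or negative exponential half with probability 0.5 each, and a second one generates the exponential variate. It assumes the Laplace pdf with scale a, so that the mean absolute value is a.

```python
import math
import random

def laplace_variate(a, rng=random):
    """Composition of two exponential halves, each chosen with probability 0.5."""
    u1 = rng.random()
    u2 = 1.0 - rng.random()          # lies in (0, 1], which avoids log(0)
    if u1 < 0.5:
        return -a * math.log(u2)     # positive half
    return a * math.log(u2)          # negative half

rng = random.Random(21)  # hypothetical seed
sample = [laplace_variate(2.0, rng) for _ in range(20000)]
mean = sum(sample) / len(sample)
mean_abs = sum(abs(x) for x in sample) / len(sample)
print(f"mean = {mean:+.3f} (theory: 0), mean |x| = {mean_abs:.3f} (theory: 2)")
```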
Convolution
Sum of n variables: $x = y_1 + y_2 + \cdots + y_n$
Generate n random variates $y_i$ and sum
For sums of two variables, pdf of x = convolution of the pdfs of $y_1$ and $y_2$. Hence the name
Although no convolution in generation
If pdf or CDF = Sum ⇒ Composition
Variable x = Sum ⇒ Convolution
Convolution: Examples
Erlang-$k$ = $\sum_{i=1}^{k}$ Exponential$_i$
Binomial($n$, $p$) = $\sum_{i=1}^{n}$ Bernoulli($p$) ⇒ Generate $n$ U(0,1), return the number of RNs less than $p$
$\chi^2(\nu)$ = $\sum_{i=1}^{\nu}$ N(0,1)$^2$
$\Gamma(a, b_1) + \Gamma(a, b_2) = \Gamma(a, b_1+b_2)$ ⇒ Non-integer value of $b$ = integer + fraction
$\sum_{i=1}^{n}$ Any ⇒ Normal; $\sum$ U(0,1) ⇒ Normal
$\sum_{i=1}^{m}$ Geometric = Pascal
$\sum_{i=1}^{2}$ Uniform = Triangular
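Two of the convolution examples can be sketched directly: Erlang-k as a sum of k exponential variates and Binomial(n, p) as a count of uniform numbers below p. Function names and parameters are illustrative, with the exponential parameterized by its mean.

```python
import math
import random

def erlang_variate(k, mean, rng=random):
    """Sum of k exponential variates, each with the given mean."""
    return sum(-mean * math.log(1.0 - rng.random()) for _ in range(k))

def binomial_variate(n, p, rng=random):
    """Generate n U(0,1) numbers and count how many are less than p."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(8)  # hypothetical seed
erlang_sample = [erlang_variate(3, 2.0, rng) for _ in range(20000)]
print(f"Erlang-3 sample mean = {sum(erlang_sample) / len(erlang_sample):.3f} (theory: 6)")
print("Binomial(10, 0.3) draws:", [binomial_variate(10, 0.3, rng) for _ in range(10)])
```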
Characterization
Use special characteristics of distributions ⇒ characterization
Exponential inter-arrival times ⇒ Poisson number of arrivals ⇒ Continuously generate exponential variates until their sum exceeds T, and return the number of variates generated before the sum exceeded T as the Poisson variate
The $a$th smallest number in a sequence of $a+b-1$ U(0,1) uniform variates has a $\beta(a, b)$ distribution
The ratio of two unit normal variates is a Cauchy(0, 1) variate
A chi-square variate with even degrees of freedom $\chi^2(\nu)$ is the same as a gamma variate $\gamma(2, \nu/2)$
If $x_1$ and $x_2$ are two gamma variates $\gamma(a,b)$ and $\gamma(a,c)$, respectively, the ratio $x_1/(x_1+x_2)$ is a beta variate $\beta(b,c)$
If $x$ is a unit normal variate, $e^{\mu+\sigma x}$ is a lognormal($\mu$, $\sigma$) variate
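The Poisson characterization can be sketched as below. Exponential inter-arrival times are accumulated until they pass T, and the count of arrivals that fell inside the interval is returned. The arrival-rate parameter and the function name are assumptions made here.

```python
import math
import random

def poisson_variate(rate, T, rng=random):
    """Count exponential inter-arrival times that fit inside [0, T]."""
    count, elapsed = 0, 0.0
    while True:
        elapsed += -math.log(1.0 - rng.random()) / rate  # next inter-arrival time
        if elapsed > T:
            return count  # the arrival that crossed T is not counted
        count += 1

rng = random.Random(13)  # hypothetical seed
sample = [poisson_variate(3.0, 2.0, rng) for _ in range(20000)]
print(f"sample mean = {sum(sample) / len(sample):.3f} (theory: rate * T = 6)")
```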
Summary
Is CDF invertible? Yes ⇒ Use inversion
Is CDF a sum of other CDFs? Yes ⇒ Use composition
Is pdf a sum of other pdfs? Yes ⇒ Use composition
Summary (contd)
Is the variate a sum of other variates? Yes ⇒ Use convolution
Is the variate related to other variates? Yes ⇒ Use characterization
Does a majorizing function exist? Yes ⇒ Use rejection
No ⇒ Use empirical inversion
Homework #6
Submit answers to exercise 27.1 Submit answers to exercise 27.4
Due: Monday, April 7, 2008, 12:45 PM Submit a hard copy to instructor