Pseudorandom Number Generation
The numpy.random module supplements the built-in Python random with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions. For example, you can get a 4 × 4 array of samples from the standard normal distribution using normal:
In [1]: samples = np.random.normal(size=(4, 4))
In [2]: samples
Output: array([[ 0.14340521, -0.39313063,  0.23171811, -0.42243503],
               [-0.11106257, -0.09632203, -0.75303053,  0.0169455 ],
               [ 0.34445876,  1.04247109,  1.36548241, -0.78550323],
               [ 0.32757408,  0.13460323, -1.03003595,  0.00847262]])
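The call above relies on the defaults (mean 0, standard deviation 1). As a minimal sketch of drawing from a different normal distribution, the example below passes the loc and scale parameters explicitly; the values 10.0 and 2.0 and the array shape are arbitrary choices for illustration, not taken from the text.

```python
import numpy as np

# loc is the mean, scale is the standard deviation, size is the output shape.
# The specific numbers here are illustrative assumptions.
samples = np.random.normal(loc=10.0, scale=2.0, size=(1000, 3))

print(samples.shape)
print(samples.mean(axis=0))  # each column mean should be close to 10
print(samples.std(axis=0))   # each column std. dev. should be close to 2
```

Because the draws are random, the printed means and standard deviations will differ slightly from run to run.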
Python’s built-in random module, by contrast, samples only one value at a time. As you can see from this benchmark, numpy.random is well over an order of magnitude faster for generating very large samples (roughly 29 times faster in this run):
In [3]: from random import normalvariate
In [4]: N = 1000000
In [5]: %timeit samples = [normalvariate(0, 1) for _ in range(N)]
Output: 1.77 s +- 126 ms per loop (mean +- std. dev. of 7 runs, 1 loop each)
In [6]: %timeit np.random.normal(size=N)
Output: 61.7 ms +- 1.32 ms per loop (mean +- std. dev. of 7 runs, 10 loops each)
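The %timeit magic is only available inside IPython or Jupyter. As a rough sketch of the same comparison in a plain Python script, the standard library's timeit module can be used instead. The smaller N and the single timed pass are assumptions made here to keep the script quick, so the numbers will not match those shown above.

```python
import timeit
from random import normalvariate

import numpy as np

N = 100_000  # smaller than the N above so the script finishes quickly

# timeit.timeit returns the total time in seconds for `number` calls
t_python = timeit.timeit(
    lambda: [normalvariate(0, 1) for _ in range(N)], number=1
)
t_numpy = timeit.timeit(lambda: np.random.normal(size=N), number=1)

print(f"pure Python: {t_python:.4f} s")
print(f"NumPy:       {t_numpy:.4f} s")
```

Absolute timings depend on the machine and on the Python and NumPy versions. A single pass is also noisier than the averaged runs that %timeit reports.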
We say that these are pseudorandom numbers because they are generated by an algorithm with deterministic behavior based on the seed of the random number generator. You can change NumPy’s random number generation seed using np.random.seed:
In [7]: np.random.seed(1234)
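To illustrate what seeding buys you, here is a small sketch showing that resetting the global seed to the same value reproduces the same sequence of draws. The variable names first and second are chosen here for illustration.

```python
import numpy as np

np.random.seed(1234)
first = np.random.normal(size=5)

np.random.seed(1234)  # reset the global generator to the same seed
second = np.random.normal(size=5)

# Same seed, same sequence of draws
print(np.array_equal(first, second))  # True
```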
The data generation functions in numpy.random share a single global random number generator, whose state is set by the seed. To avoid global state, you can use numpy.random.RandomState to create a random number generator isolated from others:
In [8]: rng = np.random.RandomState(1234)
In [9]: rng.randn(10)
Output: array([ 0.47143516, -1.19097569,  1.43270697, -0.3126519 , -0.72058873,
                0.88716294,  0.85958841, -0.6365235 ,  0.01569637, -2.24268495])
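As a sketch of what "isolated" means in practice, the example below creates two RandomState objects with the same seed and then uses the global generator in between. The names rng_a and rng_b are hypothetical. The second generator is unaffected by the global calls and yields the same values as the first.

```python
import numpy as np

rng_a = np.random.RandomState(1234)
rng_b = np.random.RandomState(1234)

a = rng_a.randn(3)

# Reseeding and drawing from the global generator does not touch rng_b
np.random.seed(0)
np.random.randn(100)

b = rng_b.randn(3)

print(np.array_equal(a, b))  # True
```

This is why passing an explicit generator object around tends to be easier to reason about than relying on np.random.seed, especially when several parts of a program draw random numbers.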
I’ll give some examples of leveraging these functions’ ability to generate large arrays of samples all at once in the next section. The following table gives a partial list of the functions available in numpy.random.
Function | Description |
seed | Seed the random number generator |
permutation | Return a random permutation of a sequence, or return a permuted range |
shuffle | Randomly permute a sequence in place |
rand | Draw samples from a uniform distribution over [0, 1) |
randint | Draw random integers from a given half-open range [low, high) |
randn | Draw samples from a normal distribution with mean 0 and standard deviation 1 (MATLAB-like interface) |
binomial | Draw samples from a binomial distribution |
normal | Draw samples from a normal (Gaussian) distribution |
beta | Draw samples from a beta distribution |
chisquare | Draw samples from a chi-square distribution |
gamma | Draw samples from a gamma distribution |
uniform | Draw samples from a uniform distribution over [low, high), which defaults to [0, 1) |
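To make a few of the table entries concrete, the following sketch calls several of them on an isolated RandomState. The seed, sizes, and ranges are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.RandomState(1234)  # seed chosen arbitrarily

perm = rng.permutation(5)          # new array holding a permuted range(5)

arr = np.arange(5)
rng.shuffle(arr)                   # shuffles arr in place and returns None

ints = rng.randint(0, 10, size=8)  # integers in [0, 10); 10 is excluded
unif = rng.uniform(-1, 1, size=8)  # floats in [-1, 1)
flips = rng.binomial(n=10, p=0.5, size=8)  # successes out of 10 trials

# Both hold the same elements as range(5), just reordered
print(sorted(perm.tolist()))  # [0, 1, 2, 3, 4]
print(sorted(arr.tolist()))   # [0, 1, 2, 3, 4]
print(ints, unif, flips, sep="\n")
```

Note the difference between the two permuting functions: permutation returns a new array and leaves its input alone, while shuffle modifies the array you pass in.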