Reference Guide

Uniform sampling Monte Carlo integration

mcquad samples uniformly from a hypercube. This method can also be used to integrate over more complicated volumes using the procedure described in Complex integration volumes. It will lead to large errors if the integration region is large and the integrand changes rapidly over a small fraction of the total integration region.

skmonaco.mcquad(f, npoints, xl, xu, args=(), rng=None, nprocs=1, seed=None, batch_size=None)

Compute a definite integral.

Integrate f in a hypercube using a uniform Monte Carlo method.

Parameters:

f : function

The integrand. It must take an iterable of length d, where d is the dimensionality of the integral, as argument, and return either a float or a numpy array.

npoints : int

Number of points to use for the integration.

xl, xu : iterable

Iterable of length d, where d is the dimensionality of the integrand, denoting the bottom left corner and upper right corner of the integration region.

Returns:

value : float or numpy array.

The estimate for the integral. If the integrand returns an array, this will be an array of the same shape.

error : float or numpy array

An estimate for the error (the integral has, approximately, a 0.68 probability of being within error of the correct answer).

Other Parameters:
 

nprocs : int >= 1, optional

Number of processes to use concurrently for the integration. Use nprocs=1 to force a serial evaluation of the integral. This defaults to 1.

seed : int, iterable or None

Seed for the random number generator. Running the integration with the same seed guarantees that identical results will be obtained (even if nprocs is different). If the argument is absent, this lets the random number generator handle the seeding. If the default rng is used, this means the seed will be read from /dev/random.

rng : module or class, optional

Random number generator. Must expose the attributes seed and ranf. The numpy.random module by default.

batch_size : int, optional

The integration is batched, meaning that batch_size points are generated, the integration is run with these points, and the results are stored. Each batch is run by a single process. It may be useful to reduce batch_size if the dimensionality of the integration is very large.

args : list

List of arguments to pass to f when integrating.

Examples

Integrate x*y over the unit square. The true value is 1./4.

>>> mcquad(lambda x: x[0]*x[1], npoints=20000, xl=[0.,0.],xu=[1.,1.])
(0.24966..., 0.0015488...)

Calculate pi/4 by summing over all points in the unit circle that are within 1 unit of the origin.

>>> mcquad(lambda x: 1 if sum(x**2) < 1 else 0.,
...     npoints=20000, xl=[0.,0.], xu=[1.,1.])
(0.78550..., 0.0029024...)
>>> np.pi/4.
0.7853981633974483

The integrand can return an array. This can be used to calculate several integrals at once.

>>> result, error = mcquad(
...      lambda x: np.exp(-x**2)*np.array((1.,x**2,x**4,x**6)),
...     npoints=20000, xl=[0.], xu=[1.])
>>> result
array([ 0.7464783 ,  0.18945015,  0.10075603,  0.06731908])
>>> error
array([ 0.0014275 ,  0.00092622,  0.00080145,  0.00069424])

Importance sampling

In importance sampling, the integrand is factored into the product of a probability density \(\rho(x)\) and another function \(h(x)\):

\[f(x) = \rho(x) h(x)\]

The integration proceeds by sampling from \(\rho(x)\) and calculating \(h(x)\) at each point. In scikit-monaco, this is achieved with the mcimport function.

skmonaco.mcimport(f, npoints, distribution, args=(), dist_kwargs={}, rng=None, nprocs=1, seed=None, batch_size=None, weight=1.0)

Compute a definite integral, sampling from a non-uniform distribution.

This routine integrates f(x)*distribution(x) by drawing samples from distribution. Choosing distribution such that the variance of f is small will lead to much faster convergence than just using uniform sampling.

Parameters:

f : function

A Python function or method to integrate. It must take an iterable of length d, where d is the dimensionality of the integral, as argument, and return either a float or a numpy array.

npoints : int >= 2

Number of points to use in the integration.

distribution : function

A Python function or method which returns random points. distribution(size) -> numpy array of shape (size,d) where d is the dimensionality of the integral. If d==1, distribution can also return an array of shape (size,). The module numpy.random contains a large number of distributions that can be used here.

Returns:

value : float or numpy array

The estimate for the integral. If the integrand returns an array, this will be an array of the same shape.

error : float or numpy array

The standard deviation of the result. If the integrand returns an array, this will be an array of the same shape.

Other Parameters:
 

args : tuple, optional

Extra arguments to be passed to f.

dist_kwargs : dictionary, optional

Keyword arguments to be passed to distribution.

nprocs : int >= 1, optional

Number of processes to use for the integration. 1 by default.

seed : int, iterable, optional

Seed for the random number generator. Running the integration with the same seed guarantees that identical results will be obtained (even if nprocs is different). If the argument is absent, this lets the random number generator handle the seeding. If the default rng is used, this means the seed will be read from /dev/random.

rng : module or class, optional

Random number generator. Must expose the attributes seed by default. The numpy.random module by default.

batch_size : int, optional

The integration is batched, meaning that batch_size points are generated, the integration is run with these points, and the results are stored. Each batch is run by a single process. It may be useful to reduce batch_size if the dimensionality of the integration is very large.

weight : float, optional

Multiply the result and error by this number. 1.0 by default. This can be used when the measure of the integral is not 1.0. For instance, if one is sampling from a uniform distribution, the integration volume could be passed to weight.

Examples

Suppose that we want to integrate exp(-x**2/2) from x = -1 to 1. We can sample from the normal distribution, such that the function f is f = sqrt(2*pi) if -1. < x < 1. else 0.

>>> import numpy as np
>>> from numpy.random import normal
>>> f = lambda x: np.sqrt(2*np.pi) * (-1. < x < 1.)
>>> npoints = 1e5
>>> mcimport(f,npoints,normal)
(1.7119..., 0.00116...)
>>> from scipy.special import erf
>>> np.sqrt(2.*np.pi) * erf(1/np.sqrt(2.)) # exact value
1.7112...

Now suppose that we want to integrate exp(z^2) in the unit sphere (x^2 + y^2 + z^2 < 1). Since the integrand is uniform along x and y and normal along z, we choose to sample uniformly from x and y and normally along z. We can hasten the integration by using symmetry and only considering the octant (x,y,z > 0).

>>> import numpy as np
>>> from numpy.random import normal, uniform
>>> f = lambda (x,y,z): np.sqrt(2.*np.pi)*(z>0.)*(x**2+y**2+z**2<1.)
>>> def distribution(size):
...     xs = uniform(size=size)
...     ys = uniform(size=size)
...     zs = normal(size=size)
...     return np.array((xs,ys,zs)).T
>>> npoints = 1e5
>>> result, error = mcimport(f,npoints,distribution)
>>> result*8,error*8
(3.8096..., 0.0248...)

The integrand can also return an array. Suppose that we want to calculate the integrals of both exp(z**2) and z**2*exp(z**2) in the unit sphere. We choose the same distribution as in the previous example, but the function that we sum is now:

>>> f = lambda (x,y,z): (np.sqrt(2.*np.pi)*(z>0.)*(x**2+y**2+z**2<1.)*
...     np.array((1.,z**2)))
>>> result, error = mcimport(f,npoints,distribution)
>>> result*8
array([ 3.81408558,  0.67236413])
>>> error*8
array([ 0.02488709,  0.00700179])

MISER Monte Carlo

mcmiser samples from a hypercube using the MISER algorithm, and can also be used to integrate over more complicated volumes using the procedure described in Complex integration volumes. The algorithm is adaptive, inasmuch as it will use more points in regions where the variance of the integrand is large. It is almost certainly likely to be superior to mcquad for “complicated” integrands (integrands which are smooth over a large fraction of the integration region but with large variance in a small region), with dimensionality below about 6.

skmonaco.mcmiser(f, npoints, xl, xu, args=(), rng=None, nprocs=1, seed=None, min_bisect=100, pre_frac=0.1, exponent=0.6666666666666666)

Compute a definite integral.

Integrate f in a hypercube using the MISER algorithm.

Parameters:

f : function

The integrand. Must take an iterable of length d, where d is the dimennsionality of the integral, as argument, and return a float.

npoints : int

Number of points to use for the integration.

xl, xu : iterable

Iterable of length d, where d is the dimensionality of the integrand, denoting the bottom left corner and upper right corner of the integration region.

Returns:

value : float

The estimate for the integral.

error : float

An estimate for the error (the integral has, approximately, a 0.68 probability of being within error of the correct answer).

Other Parameters:
 

nprocs : int >= 1, optional

Number of processes to use concurrently for the integration. Use nprocs=1 to force a serial evaluation of the integral. This defaults to 1. Increasing nprocs will increase the stochastic error for a fixed number of samples (the algorithm just runs several MISER runs in parallel).

seed : int, iterable or None

Seed for the random number generator. Running the integration with the same seed and the same nprocs guarantees that identical results will be obtained. If the argument is absent, this lets the random number generator handle the seeding. If the default rng is used, this means the seed will be read from /dev/random.

rng : module or class, optional

Random number generator. Must expose the attributes seed and ranf. The numpy.random module by default.

args : list

List of arguments to pass to f when integrating.

min_bisect : int

Minimum number of points inn which to run a bisection. If the integrator has a budget of points < min_bisect for a region, it will fall back onto uniform sampling.

pre_frac : float

Fraction of points to use for pre-sampling. The MISER algorithm will use this fraction of its budget for a given area to decide how to bisect and how to apportion point budgets.

exponent : float

When allocating points to the sub-region, the algorithm will give a fraction of points proportional to range**exponent, where range is the range of the integrand in the sub-region (as estimated by using a fraction pre_frac of points). Numerical Recipes [NR] recommends a fraction of 2/3.

Notes

Unlike mcquad, the integrand cannot return an array. It must return a float.

The implementation is that proposed in Numerical Recipes [NR]: when apportioning points, we use |max-min| as an estimate of the variance in each sub-area, as opposed to calculating the variance explicitly.

References

[NR](1, 2, 3) W. H. Press, S. A. Teutolsky, W. T. Vetterling, B. P. Flannery, “Numerical recipes: the art of scientific computing”, 3rd edition. Cambridge University Press (2007)

Examples

Integrate x*y over the unit square. The correct value is 1/4.

>>> mcmiser(lambda x: x[0]*x[1], npoints=20000, xl=[0.,0.], xu=[1.,1.])
(0.249747..., 0.000170...)

Note that this is about 10 times more accurate than the equivalent call to mcquad, for the same number of points.

Utility functions

skmonaco.integrate_from_points(f, points, args=(), nprocs=1, batch_size=None, weight=1.0)

Compute a definite integral over a set of points.

This routine evaluates f for each of the points passed, returning the average value and variance.

Parameters:

f : function

A Python function or method to integrate. It must take an iterable of length d, where d is the dimensionality of the integral, as argument, and return either a float or a numpy array.

points : numpy array

A numpy array of shape (npoints,dim), where npoints is the number of points and dim is the dimentionality of the integral.

Returns:

value : float or numpy array.

The estimate for the integral. If the integrand returns an array, this will be an array of the same shape.

error : float or numpy array

An estimate for the error (the integral has, approximately, a 0.68 probability of being within error of the correct answer).

Other Parameters:
 

nprocs : int >= 1, optional

Number of processes to use concurrently for the integration. Use nprocs=1 to force a serial evaluation of the integral. This defaults to the value returned by multiprocessing.cpu_count().

batch_size : int, optional

The integration is batched, meaning that batch_size points are generated, the integration is run with these points, and the results are stored. Each batch is run by a single process. It may be useful to reduce batch_size if the dimensionality of the integration is very large.

Examples

Integrate x*y over the unit square.

>>> from numpy.random import ranf
>>> npoints = int(1e5)
>>> points = ranf(2*npoints).reshape((npoints,2)) # Generate some points
>>> points.shape
(100000,2)
>>> integrate_from_points(lambda x_y:x_y[0]*x_y[1], points) 
(0.24885..., 0.00069...)