Reference Guide¶
Uniform sampling Monte Carlo integration¶
mcquad samples uniformly from a hypercube. This method can also be used to integrate over more complicated volumes using the procedure described in Complex integration volumes. It will lead to large errors if the integration region is large and the integrand changes rapidly over a small fraction of the total integration region.
- skmonaco.mcquad(f, npoints, xl, xu, args=(), rng=None, nprocs=1, seed=None, batch_size=None)¶
Compute a definite integral.
Integrate f in a hypercube using a uniform Monte Carlo method.
Parameters: f : function
The integrand. It must take an iterable of length d, where d is the dimensionality of the integral, as argument, and return either a float or a numpy array.
npoints : int
Number of points to use for the integration.
xl, xu : iterable
Iterable of length d, where d is the dimensionality of the integrand, denoting the bottom left corner and upper right corner of the integration region.
Returns: value : float or numpy array.
The estimate for the integral. If the integrand returns an array, this will be an array of the same shape.
error : float or numpy array
An estimate for the error (the integral has, approximately, a 0.68 probability of being within error of the correct answer).
Other Parameters: nprocs : int >= 1, optional
Number of processes to use concurrently for the integration. Use nprocs=1 to force a serial evaluation of the integral. This defaults to 1.
seed : int, iterable or None
Seed for the random number generator. Running the integration with the same seed guarantees that identical results will be obtained (even if nprocs is different). If the argument is absent, this lets the random number generator handle the seeding. If the default rng is used, this means the seed will be read from /dev/random.
rng : module or class, optional
Random number generator. Must expose the attributes seed and ranf. The numpy.random module by default.
batch_size : int, optional
The integration is batched, meaning that batch_size points are generated, the integration is run with these points, and the results are stored. Each batch is run by a single process. It may be useful to reduce batch_size if the dimensionality of the integration is very large.
args : list
List of arguments to pass to f when integrating.
Examples
Integrate x*y over the unit square. The true value is 1./4.
>>> mcquad(lambda x: x[0]*x[1], npoints=20000, xl=[0.,0.],xu=[1.,1.]) (0.24966..., 0.0015488...)
Calculate pi/4 by summing over all points in the unit circle that are within 1 unit of the origin.
>>> mcquad(lambda x: 1 if sum(x**2) < 1 else 0., ... npoints=20000, xl=[0.,0.], xu=[1.,1.]) (0.78550..., 0.0029024...) >>> np.pi/4. 0.7853981633974483
The integrand can return an array. This can be used to calculate several integrals at once.
>>> result, error = mcquad( ... lambda x: np.exp(-x**2)*np.array((1.,x**2,x**4,x**6)), ... npoints=20000, xl=[0.], xu=[1.]) >>> result array([ 0.7464783 , 0.18945015, 0.10075603, 0.06731908]) >>> error array([ 0.0014275 , 0.00092622, 0.00080145, 0.00069424])
Importance sampling¶
In importance sampling, the integrand is factored into the product of a probability density \(\rho(x)\) and another function \(h(x)\):
The integration proceeds by sampling from \(\rho(x)\) and calculating \(h(x)\) at each point. In scikit-monaco, this is achieved with the mcimport function.
- skmonaco.mcimport(f, npoints, distribution, args=(), dist_kwargs={}, rng=None, nprocs=1, seed=None, batch_size=None, weight=1.0)¶
Compute a definite integral, sampling from a non-uniform distribution.
This routine integrates f(x)*distribution(x) by drawing samples from distribution. Choosing distribution such that the variance of f is small will lead to much faster convergence than just using uniform sampling.
Parameters: f : function
A Python function or method to integrate. It must take an iterable of length d, where d is the dimensionality of the integral, as argument, and return either a float or a numpy array.
npoints : int >= 2
Number of points to use in the integration.
distribution : function
A Python function or method which returns random points. distribution(size) -> numpy array of shape (size,d) where d is the dimensionality of the integral. If d==1, distribution can also return an array of shape (size,). The module numpy.random contains a large number of distributions that can be used here.
Returns: value : float or numpy array
The estimate for the integral. If the integrand returns an array, this will be an array of the same shape.
error : float or numpy array
The standard deviation of the result. If the integrand returns an array, this will be an array of the same shape.
Other Parameters: args : tuple, optional
Extra arguments to be passed to f.
dist_kwargs : dictionary, optional
Keyword arguments to be passed to distribution.
nprocs : int >= 1, optional
Number of processes to use for the integration. 1 by default.
seed : int, iterable, optional
Seed for the random number generator. Running the integration with the same seed guarantees that identical results will be obtained (even if nprocs is different). If the argument is absent, this lets the random number generator handle the seeding. If the default rng is used, this means the seed will be read from /dev/random.
rng : module or class, optional
Random number generator. Must expose the attributes seed by default. The numpy.random module by default.
batch_size : int, optional
The integration is batched, meaning that batch_size points are generated, the integration is run with these points, and the results are stored. Each batch is run by a single process. It may be useful to reduce batch_size if the dimensionality of the integration is very large.
weight : float, optional
Multiply the result and error by this number. 1.0 by default. This can be used when the measure of the integral is not 1.0. For instance, if one is sampling from a uniform distribution, the integration volume could be passed to weight.
Examples
Suppose that we want to integrate exp(-x**2/2) from x = -1 to 1. We can sample from the normal distribution, such that the function f is f = sqrt(2*pi) if -1. < x < 1. else 0.
>>> import numpy as np >>> from numpy.random import normal >>> f = lambda x: np.sqrt(2*np.pi) * (-1. < x < 1.) >>> npoints = 1e5 >>> mcimport(f,npoints,normal) (1.7119..., 0.00116...) >>> from scipy.special import erf >>> np.sqrt(2.*np.pi) * erf(1/np.sqrt(2.)) # exact value 1.7112...
Now suppose that we want to integrate exp(z^2) in the unit sphere (x^2 + y^2 + z^2 < 1). Since the integrand is uniform along x and y and normal along z, we choose to sample uniformly from x and y and normally along z. We can hasten the integration by using symmetry and only considering the octant (x,y,z > 0).
>>> import numpy as np >>> from numpy.random import normal, uniform >>> f = lambda (x,y,z): np.sqrt(2.*np.pi)*(z>0.)*(x**2+y**2+z**2<1.) >>> def distribution(size): ... xs = uniform(size=size) ... ys = uniform(size=size) ... zs = normal(size=size) ... return np.array((xs,ys,zs)).T >>> npoints = 1e5 >>> result, error = mcimport(f,npoints,distribution) >>> result*8,error*8 (3.8096..., 0.0248...)
The integrand can also return an array. Suppose that we want to calculate the integrals of both exp(z**2) and z**2*exp(z**2) in the unit sphere. We choose the same distribution as in the previous example, but the function that we sum is now:
>>> f = lambda (x,y,z): (np.sqrt(2.*np.pi)*(z>0.)*(x**2+y**2+z**2<1.)* ... np.array((1.,z**2))) >>> result, error = mcimport(f,npoints,distribution) >>> result*8 array([ 3.81408558, 0.67236413]) >>> error*8 array([ 0.02488709, 0.00700179])
MISER Monte Carlo¶
mcmiser samples from a hypercube using the MISER algorithm, and can also be used to integrate over more complicated volumes using the procedure described in Complex integration volumes. The algorithm is adaptive, inasmuch as it will use more points in regions where the variance of the integrand is large. It is almost certainly likely to be superior to mcquad for “complicated” integrands (integrands which are smooth over a large fraction of the integration region but with large variance in a small region), with dimensionality below about 6.
- skmonaco.mcmiser(f, npoints, xl, xu, args=(), rng=None, nprocs=1, seed=None, min_bisect=100, pre_frac=0.1, exponent=0.6666666666666666)¶
Compute a definite integral.
Integrate f in a hypercube using the MISER algorithm.
Parameters: f : function
The integrand. Must take an iterable of length d, where d is the dimennsionality of the integral, as argument, and return a float.
npoints : int
Number of points to use for the integration.
xl, xu : iterable
Iterable of length d, where d is the dimensionality of the integrand, denoting the bottom left corner and upper right corner of the integration region.
Returns: value : float
The estimate for the integral.
error : float
An estimate for the error (the integral has, approximately, a 0.68 probability of being within error of the correct answer).
Other Parameters: nprocs : int >= 1, optional
Number of processes to use concurrently for the integration. Use nprocs=1 to force a serial evaluation of the integral. This defaults to 1. Increasing nprocs will increase the stochastic error for a fixed number of samples (the algorithm just runs several MISER runs in parallel).
seed : int, iterable or None
Seed for the random number generator. Running the integration with the same seed and the same nprocs guarantees that identical results will be obtained. If the argument is absent, this lets the random number generator handle the seeding. If the default rng is used, this means the seed will be read from /dev/random.
rng : module or class, optional
Random number generator. Must expose the attributes seed and ranf. The numpy.random module by default.
args : list
List of arguments to pass to f when integrating.
min_bisect : int
Minimum number of points inn which to run a bisection. If the integrator has a budget of points < min_bisect for a region, it will fall back onto uniform sampling.
pre_frac : float
Fraction of points to use for pre-sampling. The MISER algorithm will use this fraction of its budget for a given area to decide how to bisect and how to apportion point budgets.
exponent : float
When allocating points to the sub-region, the algorithm will give a fraction of points proportional to range**exponent, where range is the range of the integrand in the sub-region (as estimated by using a fraction pre_frac of points). Numerical Recipes [NR] recommends a fraction of 2/3.
Notes
Unlike mcquad, the integrand cannot return an array. It must return a float.
The implementation is that proposed in Numerical Recipes [NR]: when apportioning points, we use |max-min| as an estimate of the variance in each sub-area, as opposed to calculating the variance explicitly.
References
[NR] (1, 2, 3) W. H. Press, S. A. Teutolsky, W. T. Vetterling, B. P. Flannery, “Numerical recipes: the art of scientific computing”, 3rd edition. Cambridge University Press (2007) Examples
Integrate x*y over the unit square. The correct value is 1/4.
>>> mcmiser(lambda x: x[0]*x[1], npoints=20000, xl=[0.,0.], xu=[1.,1.]) (0.249747..., 0.000170...)
Note that this is about 10 times more accurate than the equivalent call to mcquad, for the same number of points.
Utility functions¶
- skmonaco.integrate_from_points(f, points, args=(), nprocs=1, batch_size=None, weight=1.0)¶
Compute a definite integral over a set of points.
This routine evaluates f for each of the points passed, returning the average value and variance.
Parameters: f : function
A Python function or method to integrate. It must take an iterable of length d, where d is the dimensionality of the integral, as argument, and return either a float or a numpy array.
points : numpy array
A numpy array of shape (npoints,dim), where npoints is the number of points and dim is the dimentionality of the integral.
Returns: value : float or numpy array.
The estimate for the integral. If the integrand returns an array, this will be an array of the same shape.
error : float or numpy array
An estimate for the error (the integral has, approximately, a 0.68 probability of being within error of the correct answer).
Other Parameters: nprocs : int >= 1, optional
Number of processes to use concurrently for the integration. Use nprocs=1 to force a serial evaluation of the integral. This defaults to the value returned by multiprocessing.cpu_count().
batch_size : int, optional
The integration is batched, meaning that batch_size points are generated, the integration is run with these points, and the results are stored. Each batch is run by a single process. It may be useful to reduce batch_size if the dimensionality of the integration is very large.
Examples
Integrate x*y over the unit square.
>>> from numpy.random import ranf >>> npoints = int(1e5) >>> points = ranf(2*npoints).reshape((npoints,2)) # Generate some points >>> points.shape (100000,2) >>> integrate_from_points(lambda x_y:x_y[0]*x_y[1], points) (0.24885..., 0.00069...)