Fast simulation of r.v.s from a mixture of multivariate Student's t densities

rmixt(n, mu, sigma, df, w, ncores = 1, isChol = FALSE, retInd = FALSE,
  A = NULL, kpnames = FALSE)

Arguments

n

number of random vectors to be simulated.

mu

an (m x d) matrix, where m is the number of mixture components.

sigma

as list of m covariance matrices (d x d) on for each mixture component. Alternatively it can be a list of m cholesky decomposition of the covariance. In that case isChol should be set to TRUE.

df

a positive scalar representing the degrees of freedom. All the densities in the mixture have the same df.

w

vector of length m, containing the weights of the mixture components.

ncores

Number of cores used. The parallelization will take place only if OpenMP is supported.

isChol

boolean set to true is sigma is the cholesky decomposition of the covariance matrix.

retInd

when set to TRUE an attribute called "index" will be added to the output matrix of random variables. This is a vector specifying to which mixture components each random vector belongs. FALSE by default.

A

an (optional) numeric matrix of dimension (n x d), which will be used to store the output random variables. It is useful when n and d are large and one wants to call rmvn() several times, without reallocating memory for the whole matrix each time. NB: the element of A must be of class "numeric".

kpnames

if TRUE the dimensions' names are preserved. That is, the i-th column of the output has the same name as the i-th entry of mu or the i-th column of sigma. kpnames==FALSE by default.

Value

If A==NULL (default) the output is an (n x d) matrix where the i-th row is the i-th simulated vector. If A!=NULL then the random vector are store in A, which is provided by the user, and the function returns NULL. Notice that if retInd==TRUE an attribute called "index" will be added to A. This is a vector specifying to which mixture components each random vector belongs.

Details

There are many candidates for the multivariate generalization of Student's t-distribution, here we use the parametrization described here https://en.wikipedia.org/wiki/Multivariate_t-distribution.

Notice that this function does not use one of the Random Number Generators (RNGs) provided by R, but one of the parallel cryptographic RNGs described in (Salmon et al., 2011). It is important to point out that this RNG can safely be used in parallel, without risk of collisions between parallel sequence of random numbers. The initialization of the RNG depends on R's seed, hence the set.seed() function can be used to obtain reproducible results. Notice though that changing ncores causes most of the generated numbers to be different even if R's seed is the same (see example below). NB: at the moment the parallelization does not work properly on Solaris OS when ncores>1. Hence, rmixt() checks if the OS is Solaris and, if this the case, it imposes ncores==1

References

John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw (2011). Parallel Random Numbers: As Easy as 1, 2, 3. D. E. Shaw Research, New York, NY 10036, USA.

Examples

# NOT RUN {
# Create mixture of two components
df <- 6
mu <- matrix(rep(c(1, 2, 10, 20), 2), 2, 2, byrow = TRUE)
sigma <- list(diag(c(1, 10)), matrix(c(1, -0.9, -0.9, 1), 2, 2))
w <- c(0.1, 0.9)

# Simulate
X <- rmixt(1e4, mu, sigma, df, w, retInd = TRUE)
plot(X, pch = '.', col = attr(X, "index"))

# Simulate with fixed seed
set.seed(414)
rmixt(4, mu, sigma, df, w)

set.seed(414)
rmixt(4, mu, sigma, df, w)

set.seed(414)
rmixt(4, mu, sigma, df, w, ncores = 2) # r.v. generated on the second core are different

###### Here we create the matrix that will hold the simulated random variables upfront.
A <- matrix(NA, 4, 2)
class(A) <- "numeric" # This is important. We need the elements of A to be of class "numeric". 

set.seed(414)
rmixt(4, mu, sigma, df, w, ncores = 2, A = A) # This returns NULL ...
A                                             # ... but the result is here

# }