Fast simulation of r.v.s from a mixture of multivariate Student's t densities
rmixt(n, mu, sigma, df, w, ncores = 1, isChol = FALSE, retInd = FALSE, A = NULL, kpnames = FALSE)
Argument | Description
---|---
n | number of random vectors to be simulated.
mu | an (m x d) matrix, where m is the number of mixture components; the i-th row is the mean vector of the i-th component.
sigma | a list of m covariance matrices (d x d), one for each mixture component. Alternatively it can be a list of m Cholesky decompositions of the covariances; in that case isChol should be set to TRUE.
df | a positive scalar representing the degrees of freedom. All the densities in the mixture have the same df.
w | vector of length m, containing the weights of the mixture components.
ncores | number of cores used. The parallelization will take place only if OpenMP is supported.
isChol | boolean set to TRUE if sigma contains the Cholesky decompositions of the covariance matrices (see the sketch after this table).
retInd | when set to TRUE an attribute called "index" is added to the output matrix of random variables. This is a vector specifying to which mixture component each random vector belongs.
A | an (optional) numeric matrix of dimension (n x d), which will be used to store the output random variables. It is useful when n and d are large and one wants to call rmixt() several times, without reallocating memory for the whole matrix each time.
kpnames | if TRUE the dimensions' names are preserved; that is, the i-th column of the output is named after the i-th column of mu.
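For instance, the following minimal sketch (not part of the original examples) shows how pre-computed Cholesky factors can be passed via isChol = TRUE. It assumes that rmixt() is the mvnfast implementation and that the upper-triangular factors returned by R's chol() are the expected input form.

library(mvnfast)  # assumption: rmixt() here is the mvnfast implementation

df <- 6
mu <- matrix(c(1, 2, 10, 20), 2, 2, byrow = TRUE)              # one mean vector per row
sigma <- list(diag(c(1, 10)), matrix(c(1, -0.9, -0.9, 1), 2, 2))
w <- c(0.5, 0.5)

# Pre-compute the (upper triangular) Cholesky factors once ...
cholFacts <- lapply(sigma, chol)

# ... and tell rmixt() that the list already contains Cholesky factors
X <- rmixt(1e3, mu, cholFacts, df, w, isChol = TRUE)

Factorizing once and reusing the factors avoids repeating the decomposition when the same mixture is simulated from many times.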
If A==NULL (the default) the output is an (n x d) matrix where the i-th row is the i-th simulated vector. If A!=NULL then the random vectors are stored in A, which is provided by the user, and the function returns NULL. Notice that if retInd==TRUE an attribute called "index" will be added to A. This is a vector specifying to which mixture component each random vector belongs.
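As a small illustration (a sketch under the same assumptions as above, not part of the original examples), the "index" attribute can be tabulated to check that the fraction of draws from each component is close to the mixture weights:

library(mvnfast)  # assumption: rmixt() here is the mvnfast implementation

df <- 6
mu <- matrix(c(1, 2, 10, 20), 2, 2, byrow = TRUE)
sigma <- list(diag(c(1, 10)), matrix(c(1, -0.9, -0.9, 1), 2, 2))
w <- c(0.1, 0.9)

X <- rmixt(1e4, mu, sigma, df, w, retInd = TRUE)
table(attr(X, "index")) / nrow(X)  # empirical frequencies, roughly 0.1 and 0.9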
There are many candidates for the multivariate generalization of Student's t-distribution; here we use the parametrization described at https://en.wikipedia.org/wiki/Multivariate_t-distribution.
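For reference, restating that parametrization (with nu = df, d the dimension, and mu_k, Sigma_k the k-th row of mu and k-th element of sigma), each component has density

t_\nu(x;\,\mu_k,\Sigma_k) = \frac{\Gamma\big(\frac{\nu+d}{2}\big)}{\Gamma\big(\frac{\nu}{2}\big)\,\nu^{d/2}\,\pi^{d/2}\,|\Sigma_k|^{1/2}} \left[1 + \frac{1}{\nu}(x-\mu_k)^\top \Sigma_k^{-1}(x-\mu_k)\right]^{-\frac{\nu+d}{2}},

and the mixture simulated by rmixt() has density

f(x) = \sum_{k=1}^{m} w_k\, t_\nu(x;\,\mu_k,\Sigma_k).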
Notice that this function does not use one of the Random Number Generators (RNGs) provided by R, but one of the parallel cryptographic RNGs described in Salmon et al. (2011). It is important to point out that this RNG can safely be used in parallel, without risk of collisions between parallel sequences of random numbers. The initialization of the RNG depends on R's seed, hence the set.seed() function can be used to obtain reproducible results. Notice though that changing ncores causes most of the generated numbers to be different even if R's seed is the same (see example below). NB: at the moment the parallelization does not work properly on Solaris OS when ncores>1. Hence, rmixt() checks if the OS is Solaris and, if this is the case, it imposes ncores==1.
John K. Salmon, Mark A. Moraes, Ron O. Dror, and David E. Shaw (2011). Parallel Random Numbers: As Easy as 1, 2, 3. D. E. Shaw Research, New York, NY 10036, USA.
# NOT RUN {
# Create mixture of two components
df <- 6
mu <- matrix(rep(c(1, 2, 10, 20), 2), 2, 2, byrow = TRUE)
sigma <- list(diag(c(1, 10)), matrix(c(1, -0.9, -0.9, 1), 2, 2))
w <- c(0.1, 0.9)

# Simulate
X <- rmixt(1e4, mu, sigma, df, w, retInd = TRUE)
plot(X, pch = '.', col = attr(X, "index"))

# Simulate with fixed seed
set.seed(414)
rmixt(4, mu, sigma, df, w)

set.seed(414)
rmixt(4, mu, sigma, df, w)

set.seed(414)
rmixt(4, mu, sigma, df, w, ncores = 2)  # r.v. generated on the second core are different

###### Here we create the matrix that will hold the simulated random variables upfront.
A <- matrix(NA, 4, 2)
class(A) <- "numeric"  # This is important. We need the elements of A to be of class "numeric".

set.seed(414)
rmixt(4, mu, sigma, df, w, ncores = 2, A = A)  # This returns NULL ...
A                                              # ... but the result is here
# }