SC2

Statistical Computing 2

3. Using Rcpp in an R package (2)


Using C++ code from other packages within your package

In the previous section we explained how to build a basic R package containing Rcpp-based C++ code. Here we explain how you can use C++ code from other packages (e.g., RcppArmadillo) in your package.

Consider the following function, which calculates the dot-product between two vectors using RcppArmadillo:

printFile <- function(o, n = 1e5) cat(readChar(o, n))
printFile("dotArma.cpp")

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

using namespace Rcpp;

//' Dot product in RcppArmadillo.
//'
//' @param x1 numeric vector
//' @param x2 numeric vector
//' @return dot product, that is \code{t(x1)%*%x2}
// [[Rcpp::export(dotArma)]]
arma::vec dotArma_I(arma::vec x1, arma::vec x2)
{
  arma::vec out(1);
  out[0] = arma::as_scalar(x1.t() * x2);
  return out;
}

Notice the Rcpp::depends(RcppArmadillo) attribute at the top. To add this function to the mypack package (see the previous section), we copy the .cpp file to the appropriate directory:

system("cp dotArma.cpp mypack/src/dotArma.cpp")

We then call compileAttributes on the package folder:

library(Rcpp)
compileAttributes("mypack")
## Warning: The following packages are referenced using Rcpp::depends attributes
## however are not listed in the Depends, Imports or LinkingTo fields of the
## package DESCRIPTION file: RcppArmadillo

This creates the appropriate C++ and R wrappers in mypack/src/RcppExports.cpp and mypack/R/RcppExports.R. We are not quite ready to build the package, because compileAttributes warns us that our code depends on RcppArmadillo, but the DESCRIPTION file does not mention RcppArmadillo under LinkingTo. This is required to link our package to the RcppArmadillo header files, and must be done manually:

desc <- readLines("mypack/DESCRIPTION")
kk <- which(grepl("LinkingTo", desc))
desc[kk] <- paste0(desc[kk], ", RcppArmadillo")
writeLines(desc, "mypack/DESCRIPTION")

printFile("mypack/DESCRIPTION")
## Package: mypack
## Type: Package
## Title: What the Package Does in One 'Title Case' Line
## Version: 1.0
## Date: 2023-02-06
## Author: Your Name
## Maintainer: Your Name <your@email.com>
## Description: One paragraph description of what the package does as one
##         or more full sentences.
## License: GPL (>= 2)
## Imports: Rcpp (>= 1.0.10)
## LinkingTo: Rcpp, RcppArmadillo

We then recompile the attributes:

compileAttributes("mypack")

so that RcppArmadillo is now included in RcppExports.cpp:

printFile("mypack/src/RcppExports.cpp", 300)
// Generated by using Rcpp::compileAttributes() -> do not edit by hand
// Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

#include <RcppArmadillo.h>
#include <Rcpp.h>

using namespace Rcpp;

#ifdef RCPP_USE_GLOBAL_ROSTREAM
Rcpp::Rostream<true>&  Rcpp::Rcout = Rcpp::Rcpp_cout_get();
Rcpp::Rost

Our package is ready to be built and installed:

system("R CMD build mypack")
system("R CMD INSTALL mypack_1.0.tar.gz")

Here we used RcppArmadillo as an example, but if you want to depend on a different package (e.g., RcppGSL, RcppMLPACK, etc.), the process is exactly the same.

Making your package’s C++ code callable from other packages

Having installed the mypack package, we can now use its exported R functions. For instance, we could do:

# NB do not run this, because we will re-install mypack later in the script, and 
# loading it here would cause problems
library(mypack) 
dotArma(5:1, 5:1)
# 55

If we wanted to build another package (say, secondPack) that depends on mypack, we could of course import dotArma from mypack by adding the importFrom(mypack, dotArma) line to the NAMESPACE file of secondPack. But what if we wanted to call the _mypack_dotArma_I C++ function directly in the C++ code contained in secondPack? This section explains how to do this using Rcpp attributes.

Strictly speaking, if we created a package secondPack that depended on mypack (via the Depends entry in the DESCRIPTION file) and that contained a .Call to _mypack_dotArma_I, the code would work. But if we were to check the package via R CMD check secondPack, we would get:

* checking foreign function calls ... NOTE
Foreign function call to a different package:
  .Call("_mypack_dotArma_I", ..., PACKAGE = "mypack")
See chapter ‘System and foreign language interfaces’ in the ‘Writing R
Extensions’ manual.

This is because calling native (e.g., C++) functions belonging to an R package from another R package via .Call is discouraged by CRAN. The ‘Writing R Extensions’ manual states:

“It is not portable to call compiled code in R or other packages via .Internal, .C, .Fortran, .Call or .External, since such interfaces are subject to change without notice and will probably result in your code terminating the R process.”

C++ functions contained in another package should be accessed at C++ level, but the functions need to be exported from that package. The Rcpp::interfaces attribute provides a simple way of exporting a C++ function from a package. In particular, consider the following function:

printFile("dotArma_2.cpp")
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>

// [[Rcpp::interfaces(cpp)]]

using namespace Rcpp;

// [[Rcpp::export(dotArma2)]]
arma::vec dotArma_I2(arma::vec x1, arma::vec x2)
{
  arma::vec out(1);
  out[0] = arma::as_scalar(x1.t() * x2);
  return out;
}

Here the // [[Rcpp::interfaces(cpp)]] attribute ensures that dotArma_I2 will be exported from the package at C++ level. To see how this works, we add this file to the source code of mypack:

system("cp dotArma_2.cpp mypack/src/dotArma_2.cpp") 

We then compile the attributes:

compileAttributes("mypack")

The mypack/src/RcppExports.cpp generated by compileAttributes is quite long, and we do not print it out here. But the key lines are:

// registerCCallable (register entry points for exported C++ functions)
RcppExport SEXP _mypack_RcppExport_registerCCallable() { 
    R_RegisterCCallable("mypack", "_mypack_dotArma2", (DL_FUNC)_mypack_dotArma_I2_try);
    R_RegisterCCallable("mypack", "_mypack_RcppExport_validate", (DL_FUNC)_mypack_RcppExport_validate);
    return R_NilValue;
}

This means that the _mypack_dotArma2 function (a C++ wrapper around our dotArma_I2 function) has been registered as callable from other packages. To make it more easily accessible from other packages, compileAttributes also created the following header files:

printFile("mypack/inst/include/mypack.h")
## // Generated by using Rcpp::compileAttributes() -> do not edit by hand
## // Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393
## 
## #ifndef RCPP_mypack_H_GEN_
## #define RCPP_mypack_H_GEN_
## 
## #include "mypack_RcppExports.h"
## 
## #endif // RCPP_mypack_H_GEN_
printFile("mypack/inst/include/mypack_RcppExports.h")
## // Generated by using Rcpp::compileAttributes() -> do not edit by hand
## // Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393
## 
## #ifndef RCPP_mypack_RCPPEXPORTS_H_GEN_
## #define RCPP_mypack_RCPPEXPORTS_H_GEN_
## 
## #include <RcppArmadillo.h>
## #include <Rcpp.h>
## 
## namespace mypack {
## 
##     using namespace Rcpp;
## 
##     namespace {
##         void validateSignature(const char* sig) {
##             Rcpp::Function require = Rcpp::Environment::base_env()["require"];
##             require("mypack", Rcpp::Named("quietly") = true);
##             typedef int(*Ptr_validate)(const char*);
##             static Ptr_validate p_validate = (Ptr_validate)
##                 R_GetCCallable("mypack", "_mypack_RcppExport_validate");
##             if (!p_validate(sig)) {
##                 throw Rcpp::function_not_exported(
##                     "C++ function with signature '" + std::string(sig) + "' not found in mypack");
##             }
##         }
##     }
## 
##     inline arma::vec dotArma2(arma::vec x1, arma::vec x2) {
##         typedef SEXP(*Ptr_dotArma2)(SEXP,SEXP);
##         static Ptr_dotArma2 p_dotArma2 = NULL;
##         if (p_dotArma2 == NULL) {
##             validateSignature("arma::vec(*dotArma2)(arma::vec,arma::vec)");
##             p_dotArma2 = (Ptr_dotArma2)R_GetCCallable("mypack", "_mypack_dotArma2");
##         }
##         RObject rcpp_result_gen;
##         {
##             RNGScope RCPP_rngScope_gen;
##             rcpp_result_gen = p_dotArma2(Shield<SEXP>(Rcpp::wrap(x1)), Shield<SEXP>(Rcpp::wrap(x2)));
##         }
##         if (rcpp_result_gen.inherits("interrupted-error"))
##             throw Rcpp::internal::InterruptedException();
##         if (Rcpp::internal::isLongjumpSentinel(rcpp_result_gen))
##             throw Rcpp::LongjumpException(rcpp_result_gen);
##         if (rcpp_result_gen.inherits("try-error"))
##             throw Rcpp::exception(Rcpp::as<std::string>(rcpp_result_gen).c_str());
##         return Rcpp::as<arma::vec >(rcpp_result_gen);
##     }
## 
## }
## 
## #endif // RCPP_mypack_RCPPEXPORTS_H_GEN_

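This looks quite complicated, but the key points are that the header files are placed in mypack/inst/include, and that mypack_RcppExports.h defines, inside the mypack namespace, an inline dotArma2 function which retrieves the registered _mypack_dotArma2 entry point via R_GetCCallable and then calls it.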
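The register-then-look-up pattern used by the generated code can be sketched in plain C++, without R. This is only a toy stand-in, under the assumption that a simple in-memory table is a fair mental model: toy_register and toy_get play the roles of R_RegisterCCallable and R_GetCCallable, and every name in the sketch (toy_table, toy_register, toy_get, toypack_load, toy_dot, client_dot, "toypack") is hypothetical rather than part of R or Rcpp.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-in for a table of C-callable entry points (hypothetical names).
using GenericFn = void (*)();                     // type-erased function pointer
using Key = std::pair<std::string, std::string>;  // (package, entry point name)

inline std::map<Key, GenericFn>& toy_table() {
  static std::map<Key, GenericFn> table;
  return table;
}

// Plays the role of R_RegisterCCallable: store a pointer under a name.
inline void toy_register(const std::string& pkg, const std::string& name,
                         GenericFn fn) {
  toy_table()[Key(pkg, name)] = fn;
}

// Plays the role of R_GetCCallable: look the pointer up, fail if it is missing.
inline GenericFn toy_get(const std::string& pkg, const std::string& name) {
  auto it = toy_table().find(Key(pkg, name));
  if (it == toy_table().end())
    throw std::runtime_error("entry point '" + name + "' not found in " + pkg);
  return it->second;
}

// "Provider" side: the function that the providing package wants to export.
double toy_dot(const double* x1, const double* x2, int n) {
  assert(n >= 0);
  double out = 0.0;
  for (int i = 0; i < n; ++i) out += x1[i] * x2[i];
  return out;
}

// "Provider" side: registration step, run once when the package is loaded.
inline void toypack_load() {
  toy_register("toypack", "_toypack_dot",
               reinterpret_cast<GenericFn>(&toy_dot));
}

// "Client" side: typed inline wrapper that resolves the pointer on first use
// and caches it in a static variable, as the generated dotArma2 wrapper does.
inline double client_dot(const double* x1, const double* x2, int n) {
  typedef double (*Ptr_dot)(const double*, const double*, int);
  static Ptr_dot p_dot = nullptr;
  if (p_dot == nullptr)
    p_dot = reinterpret_cast<Ptr_dot>(toy_get("toypack", "_toypack_dot"));
  return p_dot(x1, x2, n);
}
```

In this sketch the client never refers to toy_dot directly: it only knows the package name, the entry point name and the expected signature, which is why the real generated header also validates the signature before the first call.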

Nothing else needs to be done in mypack to be able to call the dotArma2 C++ function from other packages. To demonstrate this, we create a new package:

Rcpp.package.skeleton("secondPack")

We add RcppArmadillo and mypack to the LinkingTo field of its description file:

desc <- readLines("secondPack/DESCRIPTION")
kk <- which(grepl("LinkingTo", desc))
desc[kk] <- paste0(desc[kk], ", RcppArmadillo", ", mypack")
writeLines(desc, "secondPack/DESCRIPTION")

Then we add the following function:

printFile("secondDot.cpp")
## #include <RcppArmadillo.h>
## 
## #include <mypack.h>
## 
## using namespace Rcpp;
## 
## // [[Rcpp::export(secondDot)]]
## arma::vec secondDot_I(arma::vec x1, arma::vec x2)
## {
##   arma::vec out(1);
##   
##   out = mypack::dotArma2(x1, x2);
## 
##   return out;
## }

to the secondPack package:

system("cp secondDot.cpp secondPack/src/secondDot.cpp") 

Notice that secondDot.cpp includes the header mypack.h, which provides a definition of the dotArma2 function inside the mypack namespace. Hence, to use the dotArma2 function in the new package we follow the same steps that we used for RcppArmadillo, namely: we add the package to the LinkingTo field of the DESCRIPTION file, and we include its header in our C++ code.

To see whether this works, we compile and install mypack:

system("R CMD build mypack")
system("R CMD INSTALL mypack_1.0.tar.gz")

Then we compile the attributes in secondPack:

compileAttributes("secondPack")

and we install it:

system("R CMD build secondPack")
system("R CMD INSTALL secondPack_1.0.tar.gz")

Let’s see whether we can load it and use the new function:

library(secondPack)
secondDot(1:5, 1:5)
##      [,1]
## [1,]   55
t(1:5) %*% 1:5
##      [,1]
## [1,]   55

It works fine! Hence, the Rcpp::interfaces attribute allowed us to make dotArma_I2 accessible from other packages at C++ level. One thing to point out is that functions exported in this way can also be used in C++ code compiled via sourceCpp:

sourceCpp(code = '
// [[Rcpp::depends(RcppArmadillo, mypack)]]
#include <mypack.h>

using namespace Rcpp;

// [[Rcpp::export(dotSource)]]
arma::vec dotSource_I(arma::vec x1, arma::vec x2)
{
  arma::vec out(1);
  
  out = mypack::dotArma2(x1, x2);

  return out;
}
')

dotSource(3:1, 3:1)
##      [,1]
## [1,]   14

Notice that we don’t need to include RcppArmadillo.h, because it’s already included via mypack.h, while both RcppArmadillo and mypack have to appear in Rcpp::depends. This is because both packages are needed to correctly set up the compilation environment (e.g., sourceCpp will use the compilation flags -I"some_folder/RcppArmadillo/include" -I"some_folder/mypack/include" so that the compiler can find RcppArmadillo.h and mypack.h).

Using an R package to make a C++ library available to other packages

Suppose that we have a C++ library, which consists simply of the following header file:

printFile("smart/smart.h")
#ifndef smart_library
#define smart_library

namespace smart{

  inline double mysquare(double x){
    return x * x;
  }

}

#endif

This is called a header-only library, because both the declarations and the definitions of its functions are found in the header files, which are denoted by .h or .hpp. In contrast, standard C++ libraries use header files to declare the main functions that will be used by the library’s users, while the implementations and the internal functions are contained in .cpp files. For our purposes, the advantage of header-only libraries (such as Armadillo) is that it is simple to include their code in an R package and to make that code usable by other R packages.
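The difference between the two styles can be sketched in a single file. The toy library below is hypothetical (the names toy, cube, norm2, toy_check and the file names in the comments are assumptions made for illustration): cube follows the header-only style, while norm2 follows the conventional style, where the header only declares the function and the body comes from a separately compiled source file.

```cpp
#include <cassert>
#include <cmath>

// ---- would live in a header, e.g. toy/toy.h (hypothetical) ----
#ifndef TOY_LIBRARY_H
#define TOY_LIBRARY_H

namespace toy {

  // Header-only style: declaration and definition together. The `inline`
  // keyword allows every .cpp file that includes the header to contain this
  // definition without a multiple-definition error at link time.
  inline double cube(double x) {
    return x * x * x;
  }

  // Conventional style: the header only declares the function, so client
  // code must also be linked against the compiled implementation.
  double norm2(double x, double y);

}

#endif

// ---- would live in a source file, e.g. toy.cpp, compiled separately ----
double toy::norm2(double x, double y) {
  return std::sqrt(x * x + y * y);
}

// ---- client code: including the header is enough to call toy::cube ----
inline void toy_check() {
  assert(toy::cube(2.0) == 8.0);
  assert(toy::norm2(3.0, 4.0) == 5.0);
}
```

For an R package this matters because other packages can add a directory of headers to their include path through LinkingTo, whereas linking against compiled code from another package requires the registration mechanism described in the previous section.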

To ship such a library with our package, we first move the folder containing the library to the inst/include sub-folder of our package:

system("cp -a smart mypack/inst/include/smart")

We then have to change the mypack.h header (which was automatically generated by compileAttributes) to:

system("cp -a mypack_v2.h mypack/inst/include/mypack.h")
printFile("mypack/inst/include/mypack.h")
#ifndef RCPP_mypack_H_GEN_
#define RCPP_mypack_H_GEN_

#include "mypack_RcppExports.h"

#include "smart/smart.h"

#endif

The main change is that we added #include "smart/smart.h" to include the new library. We can now re-build and re-install our package:

system("R CMD build mypack")
system("R CMD INSTALL mypack_1.0.tar.gz")

And now the smart C++ library is available via our package:

sourceCpp(code = '
// [[Rcpp::depends(RcppArmadillo, mypack)]]
#include <mypack.h>

using namespace Rcpp;

// [[Rcpp::export(smartSquare)]]
NumericVector smartSquare_I(NumericVector x)
{
  NumericVector out(x.length());

  for(int ii = 0; ii < x.length(); ii++)
  {
   out[ii] = smart::mysquare(x[ii]);
  }

  return out;
}
')

smartSquare(1:5)
## [1]  1  4  9 16 25

It works! This is pretty much the mechanism used by RcppArmadillo to make the Armadillo library available to other R packages and to Rcpp programs compiled via sourceCpp (of course, RcppArmadillo does more than that: for instance, it extends the Rcpp::wrap and Rcpp::as functions to facilitate conversion between Armadillo and Rcpp objects).

For another simple example on shipping header-only libraries via an R package, see this template. For a more complex example, see the sitmo package.