w-decker/rinterface

rinterface

Quickly interface with R in Python.

Have you ever needed to run some R code in-between some Python code? Do you want to avoid figuring out rpy2? Then check out RInterface!

Installation

pip install git+https://github.com/w-decker/rinterface.git

rinterface()

A single, all-purpose function for quickly interfacing with R in Python.

Basic

In its most basic form, rinterface() takes a "script" of valid R code, writes it to a temporary file (e.g., "temp_987rf987q234780fq389475.R"), evaluates the script using the Rscript command-line tool, and then deletes the temporary file. Below is a basic example with the corresponding output.

import rinterface.rinterface as R

code = """
data(iris)
model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data=iris)
summary(model)
"""

# execute your R script
R(code)
Call:
lm(formula = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, 
    data = iris)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.82816 -0.21989  0.01875  0.19709  0.84570 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.85600    0.25078   7.401 9.85e-12 ***
Sepal.Width   0.65084    0.06665   9.765  < 2e-16 ***
Petal.Length  0.70913    0.05672  12.502  < 2e-16 ***
Petal.Width  -0.55648    0.12755  -4.363 2.41e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3145 on 146 degrees of freedom
Multiple R-squared:  0.8586,	Adjusted R-squared:  0.8557 
F-statistic: 295.5 on 3 and 146 DF,  p-value: < 2.2e-16

You can also save your script to a file.

R(code, save=True, fname="my_script.R")
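
The write, run, delete cycle described in this section can be sketched with the standard library alone. This is a minimal illustration, not rinterface's actual implementation: run_script and its parameters are hypothetical names. The demo call uses the running Python interpreter in place of Rscript so that the sketch executes even where R is not installed.

```python
import os
import subprocess
import sys
import tempfile


def run_script(code, command=("Rscript",), suffix=".R"):
    """Hypothetical helper: write `code` to a temp file, run it, delete the file."""
    # delete=False lets us close the file before the interpreter opens it
    with tempfile.NamedTemporaryFile("w", suffix=suffix, delete=False) as f:
        f.write(code)
        path = f.name
    try:
        # capture_output=True collects stdout/stderr instead of printing them
        return subprocess.run([*command, path], capture_output=True, text=True)
    finally:
        os.remove(path)  # always clean up, even if the run raised


# Stand-in demo: the Python interpreter plays the role of Rscript.
result = run_script("print(6 * 7)", command=(sys.executable,), suffix=".py")
print(result.stdout)
```

With R installed, calling run_script('cat(6 * 7)') with the default command would hand the temporary file to Rscript instead.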

Capturing output

You can also "capture" the output from your R script instead of printing it. As the example below shows, the captured text is then available on the returned object's stdout attribute.

import rinterface.rinterface as R

code = """
data(iris)

model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data=iris)
summary(model)
"""
# execute your R script
output = R(code, capture=True)
print(output.stdout)
Call:
lm(formula = Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, 
    data = iris)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.82816 -0.21989  0.01875  0.19709  0.84570 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.85600    0.25078   7.401 9.85e-12 ***
Sepal.Width   0.65084    0.06665   9.765  < 2e-16 ***
Petal.Length  0.70913    0.05672  12.502  < 2e-16 ***
Petal.Width  -0.55648    0.12755  -4.363 2.41e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3145 on 146 degrees of freedom
Multiple R-squared:  0.8586,	Adjusted R-squared:  0.8557 
F-statistic: 295.5 on 3 and 146 DF,  p-value: < 2.2e-16
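
Because the captured output is plain text, values can be pulled out of it with ordinary string tools. The sketch below extracts the two R-squared values from the summary text shown in this section. The sample string stands in for output.stdout, and r_squared is a hypothetical helper, not part of rinterface.

```python
import re

# Stand-in for output.stdout: two lines copied from the summary output
sample = (
    "Residual standard error: 0.3145 on 146 degrees of freedom\n"
    "Multiple R-squared:  0.8586,\tAdjusted R-squared:  0.8557 \n"
)


def r_squared(text):
    """Return (multiple, adjusted) R-squared parsed from summary(lm) text."""
    match = re.search(
        r"Multiple R-squared:\s*([\d.]+),\s*Adjusted R-squared:\s*([\d.]+)", text
    )
    if match is None:
        raise ValueError("no R-squared line found")
    return float(match.group(1)), float(match.group(2))


multiple, adjusted = r_squared(sample)
print(multiple, adjusted)
```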

Grabbing variables

Although seeing the output of your R script in your IPython environment is convenient, it is limiting. Sometimes you need to access the values generated in your R script. You can do this with a simple tag: # @grab{type}. Here's how: 1) assign the value you want to a variable; 2) on the line above that assignment, write the tag # @grab{type}, replacing type with the Python type (e.g., str, int, float) that the variable should be loaded as.

Important

The tag must be on the line immediately before the variable you wish to grab, and there must be an empty line below that variable before any new code can be written.

import rinterface.rinterface as R

code = """
data(iris)

model <- lm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data=iris)

# @grab{float}
aic <- AIC(model)

# @grab{float}
bic <- BIC(model)
"""
# execute your R script
aic, bic = R(code, grab=True)
aic, bic
(84.64272, 99.6959)
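
To make the tag rule concrete, here is a sketch of how # @grab{type} tags and the variable on the following line could be located in a script. It only illustrates the convention; find_grabs and both regular expressions are assumptions, not rinterface's own parser.

```python
import re

# A tag line looks like: # @grab{float}
TAG = re.compile(r"^\s*#\s*@grab\{(?P<type>[^}]+)\}\s*$")
# The grabbed variable is the first R identifier on the next line
NAME = re.compile(r"^\s*([A-Za-z.][A-Za-z0-9._]*)")


def find_grabs(code):
    """Return (variable, type) pairs for every tag directly above a variable."""
    lines = code.splitlines()
    grabs = []
    for i, line in enumerate(lines[:-1]):
        tag = TAG.match(line)
        if tag is None:
            continue
        target = NAME.match(lines[i + 1])  # must be the very next line
        if target is not None:
            grabs.append((target.group(1), tag.group("type")))
    return grabs


code = """
model <- lm(Sepal.Length ~ Sepal.Width, data=iris)

# @grab{float}
aic <- AIC(model)

# @grab{float}
bic <- BIC(model)
"""
grabs = find_grabs(code)
print(grabs)
```

A tag followed by a blank line yields no match in this sketch, which mirrors the requirement that the tag sit directly above the variable.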

More on grabbing

Thus far, the grabbing procedure can access arrays, matrices, integers, floats and strings in R and can load them back into Python as int, float, str, list[int], list[float], list[str], pandas.DataFrame or np.ndarray.

import rinterface.rinterface as R

code = """
# @grab{np.ndarray}
M <- matrix(1:6, 2, 3)

# @grab{list[float]}
y <- c(43.55, 3.0342, 3.23432)

# @grab{list[int]}
v <- c(10, 20, 30)

df <- data.frame(
  colA = c(1.5, 2.5, 3.5),
  colB = c(10, 20, 30),
  colC = c(100, 200, 300)
)

# @grab{pd.DataFrame}
df
"""
# execute your R script
results = R(code, grab=True)
print(results[0], type(results[0]))  
print(results[1], type(results[1]))  
print(results[2], type(results[2])) 
print(results[3], type(results[3]))
[[1. 2. 3.]
 [4. 5. 6.]] <class 'numpy.ndarray'>
[43.55, 3.0342, 3.23432] <class 'list'>
[10, 20, 30] <class 'list'>
   colA  colB   colC
0   1.5  10.0  100.0
1   2.5  20.0  200.0
2   3.5  30.0  300.0 <class 'pandas.core.frame.DataFrame'>

Warning

Grabbing is great, but it has not been thoroughly tested. Edge cases are bound to arise. Scalar values, strings and R's data.frame are the safest types in R to convert to equivalent types in Python.

Tip

If you're having trouble grabbing certain variables, here are a few suggestions: 1) write a print statement and capture the output (R(code, capture=True)). 2) Just grab the variable as a string and manipulate it yourself.
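
As an illustration of the second suggestion, the sketch below turns the printed form of an R numeric vector into a Python list. It assumes the string looks like R's default print output (for example "[1] 43.55000  3.03420  3.23432"); the exact text rinterface returns for a str grab is not documented here, and parse_r_numeric_vector is a hypothetical helper. Missing values such as NA are not handled.

```python
import re


def parse_r_numeric_vector(text):
    """Parse R's printed numeric vector, e.g. '[1] 1.5 2.5', into floats."""
    values = []
    for line in text.strip().splitlines():
        # Drop the leading index marker R prints on each line, such as "[1]"
        line = re.sub(r"^\s*\[\d+\]", "", line)
        values.extend(float(token) for token in line.split())
    return values


parsed = parse_r_numeric_vector("[1] 43.55000  3.03420  3.23432")
print(parsed)
```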

Some interface helpers

Floats and integers are easy to pass from your Python environment into your rinterface.rinterface script:

import rinterface.rinterface as R

x = 10

code = f"""
print({x})
"""

# execute your R script
R(code)
[1] 10

However, things like numpy arrays and pandas dataframes are more difficult. Enter rinterface.utils. With one simple function (to_r()), you can integrate your numpy arrays and pandas dataframes right into your R code at runtime.

import rinterface.rinterface as R
from rinterface.utils import to_r

import pandas as pd
from sklearn.datasets import load_iris
iris = load_iris()
iris = pd.DataFrame(iris.data, columns=iris.feature_names)

code = f"""
df <- {to_r(iris)}
head(df)
"""

# execute your R script
R(code)
  sepal.length..cm. sepal.width..cm. petal.length..cm. petal.width..cm.
1               5.1              3.5               1.4              0.2
2               4.9              3.0               1.4              0.2
3               4.7              3.2               1.3              0.2
4               4.6              3.1               1.5              0.2
5               5.0              3.6               1.4              0.2
6               5.4              3.9               1.7              0.4

rinterface.utils.to_r() supports the Python types str, bool, np.ndarray and pd.DataFrame.

Warning

rinterface.utils has not been thoroughly tested. Edge cases and errors are bound to arise.
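
To give a feel for what converting Python values into R source text involves, here is a dependency-free sketch that renders lists as c(...) and a dict of columns as data.frame(...). It is not how to_r() is implemented: r_vector and r_data_frame are hypothetical helpers, and None, NaN and column names that are not valid R identifiers are deliberately left unhandled.

```python
def r_vector(values):
    """Render a Python list as an R c(...) expression."""
    parts = []
    for v in values:
        if isinstance(v, bool):  # check bool first: bool is a subclass of int
            parts.append("TRUE" if v else "FALSE")
        elif isinstance(v, (int, float)):
            parts.append(repr(v))
        else:
            # Quote strings, escaping backslashes and double quotes for R
            escaped = str(v).replace("\\", "\\\\").replace('"', '\\"')
            parts.append(f'"{escaped}"')
    return f"c({', '.join(parts)})"


def r_data_frame(columns):
    """Render a dict of equal-length columns as an R data.frame(...) expression."""
    cols = ", ".join(f"{name} = {r_vector(vals)}" for name, vals in columns.items())
    return f"data.frame({cols})"


expr = r_data_frame({"colA": [1.5, 2.5], "colB": [10, 20]})
print(expr)
```

The resulting string can be interpolated into an f-string script in the same way as {to_r(iris)} above.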

Backend

The motivation for a "backend" was to make rinterface more compliant with parallelization in Python and system configurations on HPCs.

Scratch

What if you want to run lots of iterations of rinterface? There is currently no "single session" option, so if you're looping over lots of calls to rinterface, it will generate lots of temporary files. This is especially risky when you are running code in a storage-constrained environment. You can provide a specific path in which to save the temporary files generated by rinterface. Ideally, this is a scratch directory. Here's how to do it:

import rinterface.rinterface as R
import rinterface.backend as bk
bk.scratch = "~/usr/scratch/" # or something like this

code = """
x <- 4
cat(x)
"""

# execute your R script
R(code)

This points rinterface to your scratch directory, and all temporary files will be handled at this new location. Easy!
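
Whether rinterface expands "~" or creates a missing directory on its own is not stated here, so one cautious option is to resolve and create the scratch path before assigning it. The sketch below does that with the standard library; prepare_scratch and the demo directory name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def prepare_scratch(path):
    """Expand ~ and environment variables, create the directory, return its absolute path."""
    scratch = Path(os.path.expandvars(path)).expanduser().resolve()
    scratch.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return str(scratch)


# Demo location under the system temp directory (an arbitrary choice)
demo = prepare_scratch(os.path.join(tempfile.gettempdir(), "rinterface_scratch_demo"))
print(demo)

# With rinterface installed, the result could then be assigned as shown above:
# import rinterface.backend as bk
# bk.scratch = demo
```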

Apptainer

It's possible that R is not available directly through the Rscript command-line tool, but only inside a container image run with Apptainer. You can tell rinterface that your system uses Apptainer and provide the path to the image. Here's how to do it:

import rinterface.rinterface as R
import rinterface.backend as bk
bk.command = "apptainer"
bk.apptainer_path = "path/to/image/" # you must provide the path too!

code = """
x <- 4
cat(x)
"""

# execute your R script
R(code)

This will modify the actual command executed by rinterface.
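
The exact command rinterface builds is not shown here. As a rough picture of the idea, the sketch below assembles an argument list that either calls Rscript directly or wraps it in apptainer exec, which runs a command inside an image. build_command and its parameters are hypothetical, and the file names are placeholders.

```python
def build_command(script_path, command="Rscript", apptainer_path=None):
    """Return the argument list used to run an R script (illustrative only)."""
    if command == "apptainer":
        if not apptainer_path:
            raise ValueError("apptainer_path is required when command is 'apptainer'")
        # apptainer exec <image> <program> <args...>
        return ["apptainer", "exec", apptainer_path, "Rscript", script_path]
    return [command, script_path]


direct = build_command("temp.R")
wrapped = build_command("temp.R", command="apptainer", apptainer_path="r.sif")
print(direct)
print(wrapped)
```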

Contributing

Please submit issues or create a Pull Request. You can also email me.
