How to Use R with Google Colab Notebook

This tutorial shows how to use R in a Google Colab notebook. Colab is typically used with Python for data science, but some R packages are more mature than their Python counterparts, so it is convenient to use both languages in one project and get the best of each.

In this tutorial, you will learn:

  • How to create a Colab notebook for R
  • How to run both Python and R code in the same Python Colab notebook
  • How to convert between a pandas dataframe and an R dataframe

Section 1: Create R Notebook in Google Colab

To change the Colab notebook’s default language from Python to R, open a new notebook with the link https://colab.to/r.

We can run R code directly on this page. For example, if we type x <- 2+3 and then print x, we get 5.

We can also confirm the runtime type by going to Runtime -> Change runtime type. A Notebook settings window will pop up, and under Runtime type we can see that R is selected.

Section 2: Run R and Python in the Same Notebook

The rpy2 package lets us run R code in a Python Colab notebook through magic commands.

  • To run multiple lines of R code in a cell, put %%R at the beginning of the cell.
  • To run a single line of R code, put %R at the beginning of the line.

Section 2.1: Activate R Magic in Colab Notebook

To activate the R magic commands in the Google Colab notebook, run %load_ext rpy2.ipython.

# activate R magic
%load_ext rpy2.ipython
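
If %load_ext rpy2.ipython fails because the module cannot be found, rpy2 is probably missing from the runtime. Here is a minimal sketch for checking this first, using only the Python standard library. The helper name rpy2_available is made up for illustration and is not part of rpy2.

```python
import importlib.util


def rpy2_available() -> bool:
    """Return True if the rpy2 package can be found in this runtime.

    Hypothetical helper for illustration; it only looks the package up
    and does not import it or start R.
    """
    return importlib.util.find_spec("rpy2") is not None


# Report whether the R magic can be loaded in this runtime
print("rpy2 available:", rpy2_available())
```

If this prints False, installing the package with pip install rpy2 and then running the %load_ext line again is the usual next step.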

Section 2.2: Install and Import R Packages in Colab Notebook

After activating the R magic, we can install and load packages with ordinary R code by putting the magic command %%R at the beginning of the cell.

%%R

# Install R package tableone
install.packages('tableone')

# Import the R package
library(tableone)

Section 2.3: Convert Python Dataframe to R Dataframe

To convert between a pandas dataframe and an R dataframe, we need to import a few modules from rpy2. We also import pandas to create an example dataframe.

# Import pandas
import pandas as pd

# Import rpy2 for dataframe conversion
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
from rpy2.robjects.conversion import localconverter
from rpy2.robjects import globalenv

First, let’s create a pandas dataframe in Python. This dataframe has three columns and three rows.

# Create a pandas dataframe
df = pd.DataFrame({'fruit_name': ['apple', 'orange', 'grape'],
                  'fruit_id': [1, 2, 3],
                  'quantity': [150, 300, 200]})

# Take a look at the data
df

Next, let’s convert the pandas dataframe to an R dataframe using py2rpy. We can see that the output type is an R dataframe.

# Convert the python dataframe to the R dataframe
with localconverter(ro.default_converter + pandas2ri.converter):
  dfr = ro.conversion.py2rpy(df)

# Check the type of the conversion output
type(dfr)

Output:

rpy2.robjects.vectors.DataFrame

However, when we try to run R functions on the dataframe, we get the error message object 'dfr' not found.

%R print(summary(dfr))

Output:

RInterpreterError: Failed to parse and evaluate line 'print(summary(dfr))'.
R error message: "Error in summary(dfr) : object 'dfr' not found"

This is because dfr exists only as a Python variable; no object with that name has been created in R’s global environment. After assigning the converted dataframe to a name in R’s global environment, we can run R functions on it without errors.

# Create a variable name in R's global environment
globalenv['dfr'] = dfr

# Print statistics
%R print(summary(dfr))

Output:

  fruit_name           fruit_id      quantity    
 Length:3           Min.   :1.0   Min.   :150.0  
 Class :character   1st Qu.:1.5   1st Qu.:175.0  
 Mode  :character   Median :2.0   Median :200.0  
                    Mean   :2.0   Mean   :216.7  
                    3rd Qu.:2.5   3rd Qu.:250.0  
                    Max.   :3.0   Max.   :300.0  
StrMatrix with 18 elements.
'Length:3...	'Class :c...	'Mode :c...	...	'Mean :...	'3rd Qu.:...	'Max. :...
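
The two steps above (convert, then assign into R’s global environment) can be wrapped in one small helper so the name is never forgotten. This is only a sketch: push_to_r is a hypothetical name, not an rpy2 function, and it reuses the same rpy2 calls shown earlier in this section. The imports sit inside the function so that defining it does not require rpy2 to be installed.

```python
def push_to_r(name, pandas_df):
    """Convert a pandas dataframe and bind it to `name` in R's global environment.

    Hypothetical helper for illustration. It uses the same rpy2 calls as
    the cells above and needs rpy2 and R to be available when called.
    """
    import rpy2.robjects as ro
    from rpy2.robjects import pandas2ri
    from rpy2.robjects.conversion import localconverter

    # Convert the pandas dataframe to an R dataframe
    with localconverter(ro.default_converter + pandas2ri.converter):
        r_df = ro.conversion.py2rpy(pandas_df)

    # Make the converted dataframe visible to R code under the given name
    ro.globalenv[name] = r_df
    return r_df
```

In the notebook, calling push_to_r('dfr', df) and then running %R print(summary(dfr)) should find the object, because the assignment into globalenv happens in the same call as the conversion.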

Section 2.4: Convert R Dataframe to Python Dataframe

To convert the R dataframe back to a pandas dataframe, we use the rpy2py function. We can see that the output is a pandas dataframe.

# Convert R Dataframe to python dataframe
with localconverter(ro.default_converter + pandas2ri.converter):
  dfpd = ro.conversion.rpy2py(dfr)

type(dfpd)

Output:

pandas.core.frame.DataFrame

After conversion, we can operate on the dataframe directly using python code.

# Run python code on the converted python dataframe
dfpd.describe()
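
To check that nothing was lost on the way to R and back, we can compare the converted dataframe with the original. The round trip may change index labels and integer widths, so a strict equality check can fail even when the data is the same. The sketch below therefore compares only column names and cell values. The helper name same_values is made up for illustration.

```python
import pandas as pd


def same_values(left: pd.DataFrame, right: pd.DataFrame) -> bool:
    """Return True if both dataframes have the same columns and cell values.

    Hypothetical helper for illustration. Index labels and dtypes are
    ignored on purpose, because a round trip through R may change them.
    """
    if list(left.columns) != list(right.columns) or left.shape != right.shape:
        return False

    # to_numpy drops the index; dtype=object compares values, not dtypes
    a = left.to_numpy(dtype=object)
    b = right.to_numpy(dtype=object)
    return bool((a == b).all())


# In the notebook, after the conversion above:
# same_values(df, dfpd)
```

A False result means a column name or cell value differs, which is worth inspecting before running further analysis on the converted dataframe.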

Summary

This tutorial showed how to use R in a Google Colab notebook. You learned:

  • How to create a Colab notebook for R
  • How to run both Python and R code in the same Python Colab notebook
  • How to convert between a pandas dataframe and an R dataframe
