Easy steps to develop and publish your first R package
A step by step guide to building your own R package
R packages are open-source tools, generally used for analyzing or visualizing datasets. They are gaining popularity for many reasons, some of which are documented here.
R packages are generally collections of functions written in R and targeted at specific data analysis tasks. There are packages for data wrangling, data visualization, prediction, datasets, optimization, test benches, performance evaluation, and much more. Combining such packages makes data analysis as easy as playing a board game.
Researchers in computational domains work on many interesting problems and end up with meaningful code and analyses. GitHub is the preferred place for many of them to store and share that code. If the code is well documented and demonstrated, others can use it in countless ways. If it lacks instructions, documentation, or the prerequisite sources and libraries, however, it is challenging to troubleshoot and to reproduce the results. Version control is another serious issue.
By contrast, an R package is a complete environment for computational code that bundles everything needed to use it without hassle. R packages come with a description, vignettes, examples, versioning, shareable code, integration with an IDE (RStudio), a license, and more. In addition, packages in the CRAN repository are checked before they are accepted: contributors must address the warnings and errors raised by those checks, which greatly reduces the chance of broken code reaching users.
Many researchers avoid developing R packages because they are unaware of the benefits and find the development procedure tedious.
This post demonstrates the easiest steps to develop an R package with minimal effort. It also discusses possible warnings and the corresponding solutions.
Steps in developing an R package
1. Launch the RStudio IDE.
2. Go to File > New Project, click on New Directory.
3. Then click on R Package.
4. Provide details such as Package name and the project directory as shown below and then click on Create Project.
5. RStudio generates a number of files for the package by default. The window has four panels.
a. The top-left panel is for code. All functions are written here.
b. The bottom-left panel holds the R Console and Terminal.
c. The top-right panel shows the environment, history, and build tools of the package.
d. The bottom-right panel shows the folders and files associated with the package.
We will go through these panels in the next steps.
6. By default, the package project contains a sample R script named hello.R. Delete this script and write the code for your own function. I have copied a simple function that returns forecasted values using the PSF method (details).
7. The R community follows certain conventions for publishing R packages. The author needs to document each function: its title, input arguments, output, the packages it imports, and an example of its use, as shown in the following figure.
These details are written in comment lines that start with the “#'” symbol.
Input parameters are documented with @param, and imported packages are declared with @import. The returned value is described after @return. The @export tag makes the function available to users of the package; without it, the function stays internal. Authors may add a sample example to demonstrate the function with @examples.
Then save the R script file with the desired name.
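As a rough illustration of these tags, here is a minimal sketch of a documented function file. The function `mean_forecast()` and its arguments are hypothetical stand-ins (a simple moving-average forecast, not the PSF function used in this post).

```r
#' Forecast future values with a simple moving average
#'
#' Hypothetical example function, used only to illustrate roxygen2 tags.
#'
#' @param x A numeric vector of past observations.
#' @param n_ahead Number of future values to forecast.
#' @param window Number of most recent values to average.
#' @return A numeric vector of length \code{n_ahead}.
#' @importFrom utils tail
#' @export
#' @examples
#' mean_forecast(c(1, 2, 3, 4, 5), n_ahead = 2, window = 3)
mean_forecast <- function(x, n_ahead = 1, window = 3) {
  stopifnot(is.numeric(x), length(x) >= window, n_ahead >= 1)
  history <- x
  for (i in seq_len(n_ahead)) {
    # Each forecast is the mean of the last `window` values,
    # including forecasts already appended to the history.
    history <- c(history, mean(tail(history, window)))
  }
  tail(history, n_ahead)
}
```

Saving this as, say, R/mean_forecast.R keeps the documentation next to the code it describes.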
8. Next, update the DESCRIPTION file. This file describes the R package and its author and lists the other packages that the package imports. Click on the DESCRIPTION file in the bottom-right panel of the above figure. It opens as a text file that you can edit accordingly.
Be careful when writing the text of the Description field. It may span multiple lines, but each line should be kept reasonably short, and every continuation line must be indented (four spaces by convention). Since R packages are open-source tools, it is important to choose a suitable license for your package. The package version must also be stated correctly in the DESCRIPTION file and updated with every release.
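For orientation, a DESCRIPTION file might look roughly like the sketch below. Every value here (package name, title, author, email, license, imports) is a hypothetical placeholder, not the content of the package built in this post.

```text
Package: mypackage
Type: Package
Title: Forecast Values with a Simple Moving Average
Version: 0.1.0
Authors@R: person("First", "Last", email = "first.last@example.com",
    role = c("aut", "cre"))
Description: Provides a small helper to forecast future values of a
    numeric series. Continuation lines such as this one are indented
    with four spaces.
License: GPL-3
Encoding: UTF-8
Imports: utils
```

Note how the continuation lines of the Authors@R and Description fields are indented.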
9. Now, let’s install two packages that are lifesavers when developing R packages: devtools and roxygen2. devtools is a community-provided package that helps with building R packages, and roxygen2 is used to document the package and its functions.
Install the packages from the project console with the following code:
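A minimal version of that installation step, assuming both packages are installed from CRAN:

```r
# Install the two helper packages from CRAN (needed only once per machine)
install.packages("devtools")
install.packages("roxygen2")
```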
Then load these packages into your session with the following code:
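Loading them would typically look like this, assuming the installation above succeeded:

```r
# Attach the packages for the current R session
library(devtools)
library(roxygen2)
```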
Once you have provided all details in the function and DESCRIPTION files, generate the documentation with a single line of code:
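A likely form of that line, assuming the working directory is the package root (the RStudio project folder):

```r
# Turn the #' comments into man/*.Rd files and regenerate NAMESPACE
devtools::document()
```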
This code generates or updates the NAMESPACE file along with an .Rd file for each documented function under the man folder. These files are part of the package structure; users do not read them directly, since R renders the .Rd files as help pages. Keep in mind that an .Rd file is generated only for functions documented with #' comments, and a function is listed as an export in NAMESPACE only if its comment block includes @export.
10. Along with devtools and roxygen2, RStudio is another lifesaver. It offers buttons and keyboard shortcuts to build and update the package.
In the top-right panel, click on the Build tab. Then click on the Install and Restart button to install the R package in your environment.
11. If you update any of the functions in the package, reload all functions with Build > More > Load All and then click on Install and Restart again.
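The same two actions can also be run from the console. This is a sketch, assuming devtools is installed and the working directory is the package root; unlike the RStudio button, it does not restart the R session.

```r
# Load the current source of all package functions into the session
devtools::load_all()

# Build the package and install it into your R library
devtools::install()
```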
12. At this point the package is developed, but it is not yet ready to publish. It is time to check the package and prepare it for CRAN submission, and this is where the main challenge starts. The package should pass the check with 0 Notes, 0 Warnings, and 0 Errors before it is submitted to the CRAN repository. Reaching these zeros can be frustrating, but the author needs to address every issue.
Let’s start checking the package with a simple step. Click on the Check button under the Build tab in the top-right panel. It runs a comprehensive set of checks on the package. The most common warnings concern functions from other packages that are used in the code but not imported. It is impossible to discuss all errors and warnings in this post, but if you face any warnings, errors, or notes, feel free to comment on this post.
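The check can also be started from the console. The sketch below assumes devtools is installed and the working directory is the package root; the `stats::median` example is only an illustration of the missing-import problem mentioned above.

```r
# Run R CMD check on the package in the current working directory
devtools::check()

# A typical message for a missing import looks like
# "no visible global function definition for 'median'".
# Two common ways to fix it:
#   1. add a tag such as  #' @importFrom stats median  to the function's
#      roxygen block, list the package under Imports in DESCRIPTION,
#      and re-run devtools::document()
#   2. call the function with its namespace prefix: stats::median(x)
```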
13. Once Check reports 0 Notes, 0 Warnings, and 0 Errors, it’s time to go further. Let’s create a vignette for the package. A vignette is a detailed description and demonstration of the package with examples, written in R Markdown and usually rendered to HTML.
First of all, follow: Build > More > Configure Build Tools.
This opens a new window. Tick Use devtools package functions if available and Generate documentation with Roxygen. Then click on the Configure… button and tick all the options available there, as shown in the following figure.
The vignette is generated with the usethis package. Install and load this package with the following code:
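A minimal version of that step, assuming usethis is installed from CRAN:

```r
# Install usethis once, then attach it for the current session
install.packages("usethis")
library(usethis)
```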
Then use the following code to generate a vignette with the title ‘Introduction’, as shown below:
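The call would most likely be the one sketched here, run from the package root:

```r
# Create vignettes/Introduction.Rmd and add the vignette-related
# fields to DESCRIPTION
usethis::use_vignette("Introduction")
```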
In the above figure, the vignette file with the .Rmd extension can be seen. It is generated by the above code. You can also see a new folder (named vignettes) in the bottom-right panel.
Now, repeat the above procedure: Load All, then Install and Restart. This updates the vignette in the package folder.
14. Now, create the R package with a simple step:
Follow: Build > More > Build Source Package
This will build the R package in the root folder.
Then test the package with Build > More > Test Package.
Again, it can be challenging to resolve every issue reported by the tests. There can be several combinations of errors and warnings. If you face any problem in testing the package, feel free to comment on this post.
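Both menu actions have console counterparts in devtools. This is a sketch, assuming the working directory is the package root and, for the second call, that the package already contains testthat tests under tests/testthat.

```r
# Build the source package (a .tar.gz file); the path of the
# built file is returned
pkg_file <- devtools::build()
print(pkg_file)

# Run the package's testthat tests
devtools::test()
```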
Once the tests pass, the package is ready to submit to CRAN. You can submit the built source package (the .tar.gz file) on the CRAN website, here. The CRAN team may take 5–7 days to review the package and will contact you if they find any problem in it.
Be cautious! The R community provides a web service to test your R package, which reports additional warnings compared with the checks done in RStudio. You are expected to address all of these warnings; otherwise the submission will fail. Once you upload the package through this link, you will receive a test report by email within about 15 minutes. The most common Note in this report relates to the author’s name and email address. You can ignore this Note, but you need to address all other warnings and notes in order to publish your R package on CRAN.
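If the web service referred to here is CRAN's win-builder (an assumption; the post does not name it), devtools can upload the package for you instead of the web form. A sketch:

```r
# Assumption: the service meant above is win-builder.
# Builds the package and uploads it; the report is emailed to the
# maintainer address listed in DESCRIPTION.
devtools::check_win_devel()
```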
An accepted package gets a place in the CRAN repository, where it is listed with its description and vignette files as shown below (or here):
The advantages of publishing such R packages are discussed here. Share your views and experience. Enjoy publishing, and feel free to comment.
Dr. Neeraj Dhanraj Bokde,
Aarhus University, Denmark