Make Your Computational Analysis Citable

This post introduces the steps that academics and non-academics need to take to make their computational analysis in R citable.

Batool Almarzouq (https://batoolmm.netlify.app/), King Abdullah International Medical Research Center (KAIMRC) (https://kaimrc.med.sa/)
3-18-2021

Although there is an overwhelming number of resources about licensing and citation for R software packages, less attention is paid to making non-package (data science) code in R citable. Academics and researchers who want to embrace Open Science practices are often unaware of how to make their R code citable before publishing in academic journals, or of which license they can use to protect the intellectual property of their work.

Why make our computational analysis citable?

Books and published journal articles have long been assigned DOIs (digital object identifiers), a key element of research and academic discourse. Much less attention is paid to unpublished computational analysis, which usually ends up abandoned on GitHub (or on someone else’s hard drive).

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.

As researchers, we often spend a long time planning and designing a project and collecting and processing data, yet we often can’t publish all the results and analyses if journal reviewers deem them uninteresting.

As a researcher, if I find an interesting analysis in a GitHub repository (which I always do), I can’t cite it as:

From github.com/BatoolMM/MetagenomicsAnalysis

This is because the repository keeps changing, the URL is unstable, and there’s no metadata (e.g. author, date, …) associated with the repository. Therefore, it’s best practice to generate a DOI and attach metadata plus a license to the repository, which yields a citation similar to:

Batool Almarzouq. (2021, June). Metagenomic analysis to the soil in Saudi Arabia. Zenodo. doi.org/10.5281/zenodo.4942110
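
To illustrate why this metadata matters, here is a minimal sketch that stores the same citation as a structured object using `bibentry()` from base R's utils package. The object name `analysis_citation` is made up for this example, and the field values are simply copied from the citation above.

```r
# Citation metadata copied from the Zenodo example above.
analysis_citation <- utils::bibentry(
  bibtype   = "Misc",
  author    = utils::person("Batool", "Almarzouq"),
  title     = "Metagenomic analysis to the soil in Saudi Arabia",
  year      = 2021,
  month     = "jun",
  publisher = "Zenodo",
  doi       = "10.5281/zenodo.4942110"
)

# Print a human-readable reference, then a BibTeX entry.
print(analysis_citation, style = "text")
print(utils::toBibtex(analysis_citation))
```

With author, date, title and DOI stored as separate fields, the same record can be rendered in whatever style a journal asks for, which a bare GitHub URL cannot offer.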

What is a DOI?

A digital object identifier (DOI) is a persistent, unique identifier that permanently identifies a dataset, a piece of software, an article or another document and links to it on the web. DOIs are designed so that links to your work don’t break when a website gets updated.
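
As a small illustration of how a DOI maps to a stable link, the sketch below turns a DOI into its `https://doi.org/` resolver URL. The helper `doi_to_url()` is hypothetical (it does not come from any package), and the format check is a rough approximation rather than a full validation of the DOI syntax.

```r
# Hypothetical helper: turn a DOI into its resolver URL.
doi_to_url <- function(doi) {
  # Strip an existing resolver prefix or a "doi:" label, if present.
  doi <- sub("^((https?://)?(dx\\.)?doi\\.org/|doi:)", "", trimws(doi),
             ignore.case = TRUE)
  # Rough shape of a DOI: "10.", a registrant code, a slash, then a suffix.
  if (!grepl("^10\\.[0-9]{4,9}/[^[:space:]]+$", doi)) {
    stop("Not a recognisable DOI: ", doi)
  }
  paste0("https://doi.org/", doi)
}

doi_to_url("10.5281/zenodo.4942110")
doi_to_url("doi.org/10.5281/zenodo.4942110")
# Both calls return "https://doi.org/10.5281/zenodo.4942110"
```

The resolver URL stays the same even if the repository that hosts the files moves, which is what makes the identifier persistent.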

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807.

DOIs are generated by publishing organizations or open-access repositories such as Zenodo. Zenodo is a general-purpose open-access repository developed under the European OpenAIRE program and operated by CERN. The previous citation was produced by Zenodo. “Zenodo helps researchers receive credit by making the research results citable. Citation information is also passed to DataCite and onto the scholarly aggregators” 1.

How can we create a DOI for our R code?

1- Create a license

I can’t stress enough how important it is to add a license to any project you start. The license outlines how other researchers can use your data or analysis. Without a license, others have no clear permission to reuse the code, even if it has been posted publicly on GitHub. Adding a license to any R project is made extremely easy by the usethis package. Most package developers are familiar with usethis, but it can also be extremely useful for non-package projects. You can add a license with a single line:

library(usethis)  # provides use_mit_license()
use_mit_license("My Name")
#> ✓ Setting License field in DESCRIPTION to 'MIT + file LICENSE'
#> ✓ Writing 'LICENSE'
#> ✓ Writing 'LICENSE.md'
#> ✓ Adding '^LICENSE\\.md$' to '.Rbuildignore'
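
The output above mentions a DESCRIPTION file, which only exists in package projects. For a non-package analysis, the following sketch tries the same helper in a throwaway project. It assumes a recent version of usethis (2.0 or later) is installed. The folder name `my-analysis` and the variable `project_dir` are placeholders; in practice you would run the license helper from your real project folder.

```r
# Hypothetical throwaway location; replace with your own project folder.
project_dir <- file.path(tempdir(), "my-analysis")

# create_project() sets up a minimal non-package project (no DESCRIPTION file).
usethis::create_project(project_dir, rstudio = FALSE, open = FALSE)

# Run the license helper with that folder as the active project.
usethis::with_project(project_dir, {
  usethis::use_mit_license("My Name")
})

# Check that a license file was written to the project root.
file.exists(file.path(project_dir, "LICENSE.md"))
```

usethis offers sibling helpers such as `use_ccby_license()` and `use_gpl3_license()`, which can be swapped in if a different license suits the project better.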

There are many types of licenses, but they are not the main focus of this article. You can read more about them at the Open Source Initiative.

2- Use git or version control in your analysis

It is best practice to use Git or another type of version control when doing any kind of computational analysis. A version control system (VCS) allows you to track the iterative changes you make to your code or project. If you are not familiar with Git or with GitHub (https://github.com), one of its online hosting sites, I’d recommend that you go through this Carpentries lesson, which introduces Git to novice coders.

Again, there are abundant resources and tools for using Git within R, one of which is the usethis package. You can read more about it here.
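
As a rough sketch of what this looks like with usethis, the two calls below are the usual starting point. The wrapper name `start_version_control()` is made up for this post. It is only defined here, not run, because both calls change your project and the second one needs a GitHub personal access token to be configured first.

```r
# Hypothetical wrapper; call it once from inside your analysis project.
start_version_control <- function() {
  # Initialise a Git repository in the active project and offer a first commit.
  usethis::use_git()
  # Create a matching GitHub repository and push the project to it.
  # This requires a GitHub personal access token to be set up beforehand.
  usethis::use_github()
}
```

Running `start_version_control()` interactively in RStudio is the intended use, since usethis may ask for confirmation along the way.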

3- Generate a DOI with Zenodo

This is the step where you generate the DOI. You can use zen4R to create the DOI from R/RStudio. This package, created by Emmanuel Blondel, provides an interface to the Zenodo e-infrastructure API. The required steps are explained in this wiki, but you can also do the same thing from Zenodo itself. You start by logging in to Zenodo with your GitHub account, then create a release of your GitHub repository.

Daisie Huang and Ivan Gonzalez (eds): “Software Carpentry: Version Control with Git.” Version 2016.06, June 2016, https://github.com/swcarpentry/git-novice, 10.5281/zenodo.57467.

In three simple steps within Zenodo, you can link GitHub and generate a DOI. A very good tutorial by The Carpentries is available here.

4- Make your DOI visible in your README file

Add your citation to CITATION.md or README.md in your GitHub repository. You can also copy a DOI badge from Zenodo into your README.md. Either way, make sure the DOI appears in the GitHub repository.

This way, your research outputs can be indexed, cited, and tracked, giving certainty to your scientific work.
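
The README step can also be scripted. The sketch below builds a DOI badge in Markdown and appends it to a README file using base R only. The badge URL follows the pattern Zenodo uses for DOI badges as far as I know, so copy the exact snippet from your Zenodo record page if it differs. The README written here is a temporary stand-in, and the heading text is a placeholder.

```r
# DOI from the example citation earlier in this post.
doi <- "10.5281/zenodo.4942110"

# Markdown for a DOI badge that links to the DOI resolver.
badge <- sprintf(
  "[![DOI](https://zenodo.org/badge/DOI/%s.svg)](https://doi.org/%s)",
  doi, doi
)

# Hypothetical README location; a temporary file keeps this sketch harmless.
readme <- file.path(tempdir(), "README.md")
writeLines("# Metagenomics Analysis", readme)

# Append the badge and a short "how to cite" line to the README.
cat("", badge, "",
    paste0("Please cite this analysis as: https://doi.org/", doi),
    file = readme, sep = "\n", append = TRUE)

readLines(readme)
```

Pointing `readme` at the README.md in your repository, then committing and pushing the change, makes the DOI visible on the GitHub landing page.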

Acknowledgments

This article was inspired by a talk about Open Data by Esther Plomp in the Open Life Science Program Cohort 3. This is a link to the talk on YouTube with captions.


  1. Find out more about Zenodo from their website.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/BatoolMM/Batool-s-Blabber/blob/master/_posts/2021-06-23-make-your-computational-analysis-citable/make-your-computational-analysis-citable.Rmd, unless otherwise noted. Figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Almarzouq (2021, March 18). Batool's Blabber: Make Your Computational Analysis Citable. Retrieved from https://batool-blabber.netlify.app/posts/2021-06-23-make-your-computational-analysis-citable/

BibTeX citation

@misc{almarzouq2021make,
  author = {Almarzouq, Batool},
  title = {Batool's Blabber: Make Your Computational Analysis Citable},
  url = {https://batool-blabber.netlify.app/posts/2021-06-23-make-your-computational-analysis-citable/},
  year = {2021}
}