A workflow to be an ace reproducible scientist

I. Bartomeus

Work within a project

  • RStudio projects are self-contained.

  • GitHub integration allows for version control: https://happygitwithr.com/

  • GitHub also gives greater visibility and integrates with other tools (see below).

Create the actual dynamic document

  • Dynamic documents combine executable code and formatted text. Quarto is the current standard.

  • Text is annotated using Markdown-flavored syntax.

  • They can easily be turned into webpages, presentations and interactive documents.

  • You can experiment with one of these documents in /demo.qmd.

  • Or open a new empty Quarto document (File > New File > Quarto Document in RStudio).

Add a README and a license

  • Explain to people what you are doing in a README.md file.

  • A .md file works best here because GitHub renders it automatically.

  • Explain to people what they can do with your project: license it!

  • Other useful files: .gitignore, .Rproj

Licenses

  • Add a LICENCE txt file

    • Data: Creative commons: CC-BY

    • Code: MIT

  • Choose a license

  • Add it to README.md, e.g. a “License: CC BY 4.0” badge, which you get with this Markdown: [![License: CC BY 4.0](https://img.shields.io/badge/License-CC_BY_4.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)

If releasing code: testthat and renv

  • Defensive programming (examples) and, where it helps, wrapping your code in functions

  • Unit testing (testthat)

    • Check impossible values vs improbable values

  • Track software versions (renv, also see Docker for more complete options)

  • Code review / Pair programming
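
The distinction between impossible and improbable values can be sketched in a few lines of R. This is a minimal illustration, not a recommended implementation: the function name check_temperature, the temperature example and the 60 degree threshold are all assumptions made up for this sketch. Impossible values stop execution, improbable values only raise a warning, and the testthat block runs only if the package is installed.

```r
# Hypothetical helper: validate a vector of air temperatures in degrees Celsius.
check_temperature <- function(temp_c) {
  # Defensive checks: fail early on input the function cannot handle.
  stopifnot(is.numeric(temp_c), !anyNA(temp_c))

  # Impossible values (below absolute zero) are errors.
  if (any(temp_c < -273.15)) {
    stop("Impossible temperature: below absolute zero.")
  }

  # Improbable values are only flagged, because they may be real.
  # The 60 degree threshold is an illustrative assumption.
  if (any(temp_c > 60)) {
    warning("Improbable temperature: above 60 degrees Celsius.")
  }

  invisible(temp_c)
}

# Unit tests, run only if testthat is installed.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("impossible values fail and improbable values warn", {
    testthat::expect_error(check_temperature(-300), "Impossible")
    testthat::expect_warning(check_temperature(75), "Improbable")
    testthat::expect_silent(check_temperature(c(12.5, 20.1)))
  })
}
```

In a released project the tests would usually live in their own file under tests/testthat/ rather than next to the function.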

If releasing data, add metadata

  • Standardize coding with descriptive labels (4_NitrogenPhosphorus is better than 4NP, which is better than 4)

  • Never edit the master (raw) file

  • Use tidy data structures and plain, standard and open formats (e.g. csv)

  • Metadata (dataspice creates machine-readable (JSON, EML) and human-readable (HTML) metadata)
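
The points on coding, raw files and tidy formats can be sketched together in base R. Everything here is hypothetical and for illustration only: the data values, the column names, the label lookup and the output file name. The raw object is left unchanged, cryptic codes are replaced by descriptive labels, the table is reshaped to one row per observation, and the result is saved as csv.

```r
# Hypothetical raw data; in a real project this would be read from the
# untouched raw file rather than typed in.
raw <- data.frame(
  plot = c("A", "B"),
  treatment = c("4", "1"),
  yield_2021 = c(2.3, 1.8),
  yield_2022 = c(2.9, 2.1),
  stringsAsFactors = FALSE
)

# Lookup table from cryptic codes to descriptive labels (illustrative).
labels <- c("1" = "1_Control", "4" = "4_NitrogenPhosphorus")

clean <- raw  # work on a copy, never on the raw object
clean$treatment <- unname(labels[clean$treatment])
stopifnot(!anyNA(clean$treatment))  # every code must have a label

# Tidy structure: one row per plot and year, one column per variable.
tidy <- data.frame(
  plot = rep(clean$plot, times = 2),
  treatment = rep(clean$treatment, times = 2),
  year = rep(c(2021L, 2022L), each = nrow(clean)),
  yield = c(clean$yield_2021, clean$yield_2022),
  stringsAsFactors = FALSE
)

# Save in a plain, open format. tempdir() stands in for a project folder.
out_file <- file.path(tempdir(), "yield_clean.csv")
write.csv(tidy, out_file, row.names = FALSE)
```

Keeping the cleaning steps in a script like this means the clean file can always be regenerated from the raw one.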

Make the document/webpage visible on GitHub

  • In GitHub go to Settings/Pages and enable webpage creation.

  • Many options are possible, but the simple one should work: Deploy from a branch, using the main branch and the /docs folder.

  • GitHub will look for an index.html file; a .nojekyll file tells it to skip Jekyll processing.

  • You can link the page from the “About” section in the right column of the repository.

Issues: Templates.

  • Explain how to contribute and report bugs (e.g. GitHub issues)

Automation: GitHub Actions

GitHub Actions, e.g. https://github.com/ibartomeus/CropPollinationModels

Summary:

README.md with DOI, version, file description, relevant info, etc…

/Data with raw data, clean data and the relevant scripts.

/Analysis

/Manuscript

LICENCE

testthat

News.md

Metadata/

renv/
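
The layout above could be created from R along these lines. This is only a sketch: it builds the skeleton in a temporary directory so nothing is overwritten, the tests/testthat path follows the usual testthat convention rather than anything stated here, and the project name is made up.

```r
# Create the layout in a temporary directory; replace tempfile() with a
# real project path when adapting this.
project <- tempfile("my_project_")

# Folders from the summary. tests/testthat is an assumed location for tests.
folders <- c("Data", "Analysis", "Manuscript", "Metadata", "tests/testthat")
for (folder in folders) {
  dir.create(file.path(project, folder), recursive = TRUE)
}

# Empty placeholder files, to be filled in by hand.
files <- c("README.md", "LICENCE", "News.md")
file.create(file.path(project, files))

# The renv/ folder is not created here: renv::init() adds it when run
# inside the project.
list.files(project)
```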

All will change

  • Tools evolve

  • Plan accordingly (use standard open formats)

  • Will we see living papers in our lifetime?

Acknowledgement: