
Put technical reproducibility burden completely on authors #22

mstimberg opened this issue Apr 23, 2021 · 3 comments

@mstimberg

It is of course mainly the author's responsibility to come up with a workflow, but the current community process implies that the reviewer could/should invest additional work to make things reproducible, e.g. by making the repository binder-ready, creating a Makefile (see issue #19), etc.
I think it would be better to clearly separate the responsibilities: the author provides the workflow (binder, Rmarkdown or Jupyter notebook, Makefile, shell script, README file with step-by-step instructions, ...) and all the CODECHECKER does is follow/execute this workflow, verify that it works, and document the exact environment that was used. The last part would be something like pip freeze, conda list or R's sessionInfo(), i.e. not a format that allows exact push-button reproduction (you'd need the same platform, etc.), but one that could give valuable hints if future replications fail or give different results.
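To illustrate what "documenting the exact environment" could mean in practice, here is a minimal sketch (not part of the proposal itself) that produces a `pip freeze`-style listing using only the Python standard library; the function name is an assumption for illustration:

```python
# Minimal sketch of recording the Python environment used during a check,
# similar in spirit to `pip freeze` (stdlib only, Python >= 3.8)
from importlib import metadata

def environment_report():
    """Return sorted 'name==version' lines for all installed distributions."""
    dists = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with malformed metadata
            dists[name] = dist.version
    return "\n".join(f"{name}=={version}" for name, version in sorted(dists.items()))

print(environment_report())
```

The output could be stored alongside the certificate; as noted above, it would not allow push-button reproduction, but it records the versions that were actually used.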

nuest commented May 6, 2021

Hi @mstimberg, thank you for pointing this out. I agree with the responsibilities you describe and understand the need for rewording the guidelines. In my own experience as a codechecker, I often do create extra configuration files or a Dockerfile, which is sometimes done so quickly that it doesn't feel like extra work.

I'm pretty sure the guidelines do not say "the reviewer should invest additional work", but I will take another look and try to make this clearer.

nuest commented Jun 2, 2021

While adding this, we could also recommend that authors consider using generic environments that are widely used within their community, such as the rocker/geospatial images for R + geospatial work, or NeuroDebian for neuroscience.
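As a hypothetical sketch of what that could look like (the image tag, package name, and file names below are assumptions, not a recommendation of specific versions), an author would only declare their delta on top of the community image:

```dockerfile
# Hypothetical sketch: build on a widely used community base image
FROM rocker/geospatial:4.3.1

# add only what the paper needs beyond the base image
RUN R -e 'install.packages("hypotheticalpkg")'

COPY . /work
WORKDIR /work
CMD ["Rscript", "workflow.R"]
```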

HT to @rougier who pointed this out in the F1000 review.


rougier commented Jun 4, 2021

This is something we plan to do for ReScience using Guix. The idea is to offer a more or less standard environment for each domain, so that the author only has to specify what is different.
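A sketch of what such a per-domain base might look like as a Guix manifest (the package selection here is an illustrative assumption, not an actual ReScience environment):

```scheme
;; Hypothetical manifest.scm: a domain-wide standard environment,
;; with the author adding only what differs for their paper
(specifications->manifest
 (list
  ;; shared base for, e.g., computational neuroscience
  "python" "python-numpy" "python-scipy" "python-matplotlib"
  ;; paper-specific addition
  "python-brian2"))
```

The author could then enter the environment with something like `guix shell -m manifest.scm`, and Guix would pin the exact package graph for later reproduction.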
