Ensure reproducibility of the full result of mindep #6

Open
rht opened this issue Mar 19, 2017 · 4 comments

Comments

@rht
Contributor

rht commented Mar 19, 2017

@Futrell, WDYT of publishing the code (maybe in a Jupyter notebook) that was used to create the plots summarizing the post-processed output? What if there were a "reproducibility number" for any paper, whose count is incremented whenever a peer validates its result? I haven't fully fleshed out what the sufficient criterion for validating a result should be, or whether there are stages/hierarchies of criteria (perhaps it sits somewhere between verifying a result and falsifying it).
At least this should be about checking for systematic bugs, as opposed to attesting whether a discovery is 5-sigma certain. This could complement a rough measure of scientific consensus, e.g. (citation count / size of the field). ...

(In short: a request for the code behind the fancy plots!)

@Futrell
Owner

Futrell commented Mar 19, 2017

Sure, I can upload the plotting scripts. They've bit-rotted a bit, so I have to fix them up. This whole thing was originally an IPython notebook, and then I slowly pulled chunks out into separate files.

I just pushed some commits reorganizing most of the code into a cliqs folder, with run_mindep.py on the outside. Sorry for not going through the pull request process; I'm not quite a GitHub pro yet, so I wasn't sure how to make a pull request on my own repo.

@Futrell
Owner

Futrell commented Mar 19, 2017

OK, I put in the analysis scripts.

@rht
Contributor Author

rht commented Mar 20, 2017

I see. I can almost run mindep_plots here: https://github.com/rht/cliqs/blob/jupyter/mindep_plots.ipynb (or https://nbviewer.jupyter.org/github/rht/cliqs/blob/jupyter/mindep_plots.ipynb). Each plot is split into its own cell. Two issues remain:

  • The line stat_smooth(method="auto", mapping=aes(colour=real)) + is commented out, because otherwise the plots couldn't be rendered; I wonder if there is a dependency package that still needs to be installed.
  • There is an error near the end: Error in $<-.data.frame(tmp, "p.less.than", value = "< .001"): replacement has 1 row, data has 0 (see the sketch after this comment).

(You'll have to activate the Travis build at https://travis-ci.org/Futrell/cliqs.)
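
That second error is R's way of saying that the data frame being modified has zero rows at that point, so the "< .001" value cannot be assigned into a p.less.than column. A minimal sketch of a guard that would surface this earlier, assuming the post-processed output were loaded into the notebook with pandas (the file path and column name are hypothetical):

```python
import pandas as pd

# Hypothetical path to the post-processed mindep output used by the notebook.
results = pd.read_csv("results/mindep_summary.csv")

# If upstream filtering leaves zero rows, fail loudly here instead of letting
# a later column assignment die with "replacement has 1 row, data has 0".
if results.empty:
    raise ValueError(
        "The summary table is empty; check that the input file has the "
        "expected columns and values before assigning p.less.than."
    )

# Only now is it safe to tag the remaining rows with the p-value string.
results["p.less.than"] = "< .001"
```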

@rht
Contributor Author

rht commented Mar 20, 2017

Going by the classification of reproducibility in http://ropensci.github.io/reproducibility-guide/sections/introduction/:

  • computational reproducibility [1]: still pending. There is also the question of whether the execution environment should be fully provided, preferably in a way that guarantees a deterministic build, with a stronger guarantee than Docker (via nixpkgs!) [2]. Computational reproducibility should also be easy to automate, just as Travis automates code test cases (see the sketch after the references below).
  • empirical reproducibility: the datasets used by the repo (http://tedlab.mit.edu/datasets/cliqs/) are well documented.
  • statistical reproducibility: the parameters are provided in the paper.

[1] https://gigascience.biomedcentral.com/articles/10.1186/s13742-016-0135-4
[2] http://www.sciencedirect.com/science/article/pii/S0167739X16000029
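
On the automation point above: a rough sketch of what a Travis-runnable reproducibility check could look like, rerunning the pipeline and comparing a hash of its output against a committed reference. The run_mindep.py invocation, the output path, and the reference-checksum file are all assumptions, not the project's actual interface:

```python
import hashlib
import subprocess
import sys
from pathlib import Path

# Re-run the pipeline; the exact arguments to run_mindep.py are an assumption.
subprocess.run([sys.executable, "run_mindep.py"], check=True)

# Hypothetical output file plus a reference checksum committed with the repo.
output = Path("results/mindep_output.csv")
expected = Path("results/mindep_output.sha256").read_text().strip()

actual = hashlib.sha256(output.read_bytes()).hexdigest()
if actual != expected:
    sys.exit(f"Not reproduced: expected {expected}, got {actual}")
print("Reproduced: the output matches the committed checksum.")
```

A byte-level comparison like this only makes sense once the execution environment is pinned (the nixpkgs route above); otherwise the check could compare summary statistics within a tolerance instead.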
