Handbook: A General Lab Workflow Using GitHub
Taylor Salo edited this page Mar 14, 2022 · 2 revisions
- Try to avoid graphical user interfaces (GUIs) when there is a code-based alternative available. Results from GUIs are difficult to reproduce.
- If you must use a GUI, look at the tool's documentation to determine if there is an option to generate a macro, script, or config file based on the steps taken in the GUI.
- At minimum, write out each step taken in detail.
- Part of review for each project should include having another member of the lab follow your instructions to reproduce your GUI-based results.
- All code should be pushed to GitHub regularly. This includes the following:
- Preprocessing scripts
- Statistical analyses
- Figure-generating scripts
- Job scripts
- Anything used to explore the data
- Work on forks for development, and open pull requests when you are ready for your code to be reviewed.
- Do not commit directly to NBCLab repositories.
- Request pull request reviews from folks who are working on the project or who are solid coders.
- Add branch protection rules, if necessary, to ensure that this is enforced.
- Use GitHub issues to track code- and data-related tasks.
- We can also have repository- or organization-level project boards to track issues.
- @-mention people when you need their help.
- Assign issues to folks when they agree to take them on.
- Remember to post summaries of pertinent offline conversations in issues.
- Trello is used for tasks related to writing and IRBs.
- Use GitHub issues and project boards to track research assistant tasks.
- You can access each research assistant's work history with a search:
- Use the search pattern "is:issue user:NBCLab assignee:username"
- E.g., https://github.com/issues?q=is%3Aissue+user%3ANBCLab+assignee%3Atsalo
- Research assistant recommendation letters should use these work histories to identify accomplishments.
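The search pattern above can be turned into a link programmatically. The following is a minimal sketch, not an official GitHub client: the function name `work_history_url` and its `org` default are assumptions made for illustration, and it only URL-encodes the query string shown in the example.

```python
from urllib.parse import urlencode


def work_history_url(username, org="NBCLab"):
    """Build a GitHub issue-search URL for issues assigned to `username` in `org`.

    `work_history_url` is a hypothetical helper; it only encodes the
    search pattern "is:issue user:<org> assignee:<username>".
    """
    query = f"is:issue user:{org} assignee:{username}"
    # urlencode percent-encodes ":" and turns spaces into "+"
    return "https://github.com/issues?" + urlencode({"q": query})


print(work_history_url("tsalo"))
```

For the username in the example above, this should reproduce the same URL, which makes it easy to generate one link per research assistant.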
- Every repository should have a README, a license (generally Apache 2.0), and a gitignore.
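A quick way to check a local clone for these three files is sketched below. The accepted file names in `REQUIRED` and the helper `missing_repo_files` are assumptions for illustration; adjust them to whatever naming the lab actually uses.

```python
from pathlib import Path

# Assumed acceptable file names for each required item (hypothetical list).
REQUIRED = {
    "README": ("README", "README.md", "README.rst", "README.txt"),
    "license": ("LICENSE", "LICENSE.md", "LICENSE.txt"),
    "gitignore": (".gitignore",),
}


def missing_repo_files(repo_dir):
    """Return the labels of required items not found at the top level of `repo_dir`."""
    repo = Path(repo_dir)
    present = {p.name for p in repo.iterdir() if p.is_file()}
    return [
        label
        for label, names in REQUIRED.items()
        if not present.intersection(names)
    ]


# Example: report what the current directory is missing.
print(missing_repo_files("."))
```

An empty list means all three items were found. The check only looks at file names, so it does not confirm that the license is actually Apache 2.0.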
- Repository names should follow lab convention:
- [project]-project: Base project repository, containing code for BIDSification and anonymization.
- [project]-[title-info]: Specific subproject under the banner of the overall project. Preferably linked to a paper, poster, or talk.
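As a rough illustration of the naming convention, the sketch below classifies a repository name given a known project prefix. The function `classify_repo_name` and the example names are hypothetical, and the rule that a subproject is "anything else starting with the project prefix" is an assumption rather than a stated lab rule.

```python
def classify_repo_name(name, project):
    """Classify `name` against the convention for a given `project` prefix.

    Returns "base" for "<project>-project", "subproject" for
    "<project>-<title-info>", and "nonconforming" otherwise.
    """
    prefix = f"{project}-"
    if name == f"{project}-project":
        return "base"
    # Assumption: any non-empty suffix after the prefix counts as title info.
    if name.startswith(prefix) and len(name) > len(prefix):
        return "subproject"
    return "nonconforming"


# Hypothetical names, for illustration only.
for repo in ("myproject-project", "myproject-connectivity-paper", "scratch"):
    print(repo, "->", classify_repo_name(repo, "myproject"))
```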
- After analyses have been run, consider using something like "pip freeze" to generate a list of dependencies (and their versions) to cite in the manuscript. Always cite your dependencies (when those dependencies provide citation instructions, at least).
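If you would rather collect the list from inside the analysis environment, the standard library can produce something similar. This is a minimal sketch using `importlib.metadata` (Python 3.8+); it is not a replacement for "pip freeze", which handles editable installs and excludes some packaging tools, and the helper name `installed_versions` is an assumption.

```python
from importlib.metadata import distributions


def installed_versions():
    """Return a name -> version mapping for distributions visible to this interpreter."""
    versions = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            versions[name] = dist.version
    # Sort case-insensitively so the output is stable across runs.
    return dict(sorted(versions.items(), key=lambda kv: kv[0].lower()))


# Print in the familiar "name==version" form.
for name, version in installed_versions().items():
    print(f"{name}=={version}")
```

Saving this output alongside the analysis code gives a record of the versions that were actually installed when the results were generated.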
- Do not automatically watch new repositories in the organization.
- This will lead to an overabundance of GitHub notifications. You will start to ignore your GitHub notifications, which means you will miss cases where people actually need your input/help.
- Do not include any protected data or information in GitHub repositories, including in the code, issues, or comments.
- Repositories can be kept private until the preprint is uploaded, after which they should be made public.
- Do not comment on GitHub issues or pull requests via email. Your email's automated salutations end up in the comment, and they look terrible.
- All data should also be version controlled.
- Tabular and text files should be managed with git/GitHub.
- This, of course, must be done in a HIPAA-compliant manner.
- Large files should be managed with git-annex/datalad.
- All manuscripts should be version controlled.
- Prior to coauthor consensus, use Google Docs.
- After coauthor consensus, export to LaTeX and use Overleaf to link the manuscript to a GitHub repository.
- The manuscript can most likely be stored in a subfolder of the project repository. Something like "paper/".