Welcome to BIOS 259 – a Stanford Biosciences mini-course on computational reproducibility
This mini-course is designed to equip graduate students and postdocs with the essential skills for making computational research reproducible. Through practical exercises and interactive sessions, participants will learn best practices, tools, and techniques for doing open and reproducible research. Topics covered include version control, containerization, data management, workflows, and documentation strategies. The course also examines the primary causes and consequences of irreproducibility in research, helping students overcome these challenges, work more rigorously, and strengthen the credibility and impact of their computational work. Participants will gain practical experience in achieving computational reproducibility across various domains, including biology.
This course aims to foster a culture of reproducibility, open science, open source, and collaboration in research, and to provide the tools and skills needed to put it into practice. Through active engagement and completion of course activities, you will be able to:
- Understand the importance of computational reproducibility in research and the causes of irreproducibility
- Gain proficiency in version control systems (e.g., Git) for collaborative code and data tracking
- Create and share conda environments for software dependency management
- Utilize containerization tools (e.g., Docker, Singularity) for portable computing environments
- Develop effective strategies for managing and documenting data and code to ensure reproducibility
- Implement best practices for transparent and reproducible project organization
Version Control with Git: Learn how to track changes, collaborate effectively, and maintain a robust version history of your code and documents.
Environment management with Conda/Mamba and Pixi: Explore Conda/Mamba, widely used package managers, and learn how to manage software dependencies in your projects efficiently and reproducibly. We will also explore pixi, a newer package manager.
Containerization with Docker and Singularity: Understand how to encapsulate your computational environment, ensuring consistent and reproducible execution across different systems.
Documentation, code, and data sharing: Learn best practices for organizing, sharing, and documenting code and data to facilitate reproducibility and enable others to build upon your work.
- Basic understanding and familiarity with programming (e.g., Python, R)
- Basic understanding of the Unix/Linux command line (Bash)
Date | Day | Topics covered | Time | Location | Material |
---|---|---|---|---|---|
12.02.2024 | Mon | Introduction to reproducibility and setting up | 9:30-12:30 | Edwards R166 | [Slides], Setup instructions |
12.03.2024 | Tue | Environment management (Conda/Mamba, pixi) | 9:30-12:30 | Edwards R166 | [Slides], Conda cheat sheet |
12.04.2024 | Wed | Version Control (Git/GitHub) | 9:30-12:30 | Edwards R166 | Slides, Git cheat sheet |
12.05.2024 | Thu | Containerization (Docker, Singularity) | 9:30-12:30 | Edwards R166 | [Slides], Docker cheat sheet, Exercise |
12.06.2024 | Fri | Document and share (Notebooks, FAIR data, and open code) and wrap-up | 9:30-12:30 | Edwards R166 | [Slides], Exercise |
We will add additional learning material here. In the meantime, you can find a list of resources at the end of each day's slides.