Add extended section on reproducible research
drnelson6 authored May 9, 2024
1 parent c08f35c commit ae5176c
Showing 1 changed file with 28 additions and 0 deletions.
28 changes: 28 additions & 0 deletions content/lessons/repro_research.md
@@ -19,6 +19,34 @@ There are several factors that contribute to producing reproducible research. Th
- Using a version control system to facilitate collaboration on the project
- Recording your environment so others can run your code on their machine

### Reproducible research

Reproducible research is a moving target in the humanities. While the social sciences have recently reckoned with a so-called "[replication crisis](https://en.wikipedia.org/wiki/Replication_crisis)," the humanities are only beginning to think about how their research can be made reproducible. As the humanities increasingly work with large data sets and computational tools whose output exceeds what a third-party observer can verify manually, we need to agree on best practices that ensure our peers can trust the validity of our results.

As digital humanists, we can learn several lessons from the social sciences and hard sciences to avoid a "replication crisis" in the humanities. A big step towards producing more reproducible research is writing better code that others can reuse to produce the same results.

Reproducibility comes in multiple forms:

- Someone else wants to download my data and code to verify my results independently
- Someone else wants to use my code on new data to produce their own research
- Someone else wants to modify my data and code to test edge cases in my results

Some key aspects of reproducible research include:

- publication of the raw underlying data used to achieve the results;
- clear documentation of the steps taken to achieve the results;
- open source release of the code used for data gathering, analysis, and other steps;
- separating code based on function (i.e. modular code development) so others can interpret and reuse your code;
- documentation of key decisions and changes in the project (i.e. version control);
- means to ensure the tools do what they are supposed to (e.g. tests, code review).
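
As a minimal sketch of the points about modular code and tests, the example below splits a small task into two single-purpose functions and checks them with a short test. The function names (`normalize_title`, `count_titles`) and the sample titles are hypothetical and chosen only for illustration; they are not taken from any particular project.

```python
from collections import Counter


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse runs of whitespace (hypothetical helper)."""
    return " ".join(title.lower().split())


def count_titles(titles: list[str]) -> Counter:
    """Count how often each normalized title occurs."""
    return Counter(normalize_title(t) for t in titles)


def test_count_titles() -> None:
    """A small test that records the behavior we expect from the code."""
    counts = count_titles(["Moby  Dick", "moby dick", "Emma"])
    assert counts["moby dick"] == 2
    assert counts["emma"] == 1


if __name__ == "__main__":
    test_count_titles()
    print("all tests passed")
```

Because each function does one thing, a reader can reuse `normalize_title` on new data without running the rest of the analysis, and the test doubles as documentation of the intended behavior.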

In addition, reproducible research follows community-defined best practices to ensure that your project can be understood and used by others. These practices may evolve over time, but these lessons offer a set of basic principles that can guide the development of reproducible research in the Digital Humanities.

##### Resources

- Rik Peels, "Replicability and replication in the humanities." _Research Integrity and Peer Review_ 4 (2019). <https://doi.org/10.1186/s41073-018-0060-4>
- Joseph Flanagan, "Reproducible research: Strategies, tools, and workflows." _Studies in Variation, Contacts and Change in English_, eds. Turo Hiltunen, Joe McVeigh, Tanja Säily (Helsinki: Research Unit for Variation, Contacts and Change in English, 2017). <https://varieng.helsinki.fi/series/volumes/19/flanagan/>

### Organizing your project

Your project will involve a number of components. These may include raw data, processed data, documentation, source code, and code for dependencies. You may be working as part of a team, or you may be writing code that others will use in the future. In either case, your project should be organized so that someone unfamiliar with it can quickly find what they need, whether that is a data set or a function in your code. For your research to be reproducible and make sense to others, you should organize your project in a consistent, predictable way.
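
One way to make a layout predictable is to create it with a short script, so every project starts from the same skeleton. The sketch below is only an illustration: the directory names in `LAYOUT` and the `scaffold` function are assumptions, not a required or standard structure.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: these directory names are illustrative assumptions.
LAYOUT = ["data/raw", "data/processed", "docs", "src", "tests"]


def scaffold(root: Path) -> list[Path]:
    """Create the project skeleton under `root` and return the directories."""
    created = []
    for relative in LAYOUT:
        directory = root / relative
        # exist_ok=True makes the script safe to run more than once.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)

    # Add a placeholder README only if the project does not have one yet.
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(
            "# Project title\n\nDescribe the data, the code, and how to run it.\n",
            encoding="utf-8",
        )
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(Path(tmp) / "my_project"):
            print(path.relative_to(tmp))
```

Keeping raw and processed data in separate directories means the original files are never overwritten, so others can rerun the processing steps from the same starting point.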