submission 687 #64

Merged
merged 3 commits into from
Sep 9, 2024
8 changes: 8 additions & 0 deletions submissions/687/_quarto.yml
@@ -0,0 +1,8 @@
project:
  type: manuscript

manuscript:
  article: index.qmd

format:
  html: default
52 changes: 52 additions & 0 deletions submissions/687/index.qmd
@@ -0,0 +1,52 @@
---
submission_id: 687
categories: 'Session 5A'
title: Go Digital, They Said. It Will Be Fun, They Said. Teaching DH Methods for Historical Research
author:
  - name: Ina Serif
    orcid: 0000-0003-2419-4252
    email: [email protected]
    affiliation: University Basel, Switzerland
keywords:
  - teaching computer-assisted methods
  - digital history
  - digital literacy
abstract: |
  The digitization of historical materials and the application of computational techniques significantly expand the spectrum of sources and questions for historical research. However, the practical use of computer-assisted methods often involves resolving technical problems unique to a specific project. When teaching such methods to history students, this is the major challenge: there isn't a simple set of commands that covers all the potential issues in a research project. Moreover, the goal is not to train humanities students to be computer scientists, but to equip them with the skills to tackle specific problems. I will discuss how, based on problems faced in my own research, I combine the teaching of computer-assisted methods with student projects to help the students understand the limitations of out-of-the-box solutions while letting them experience the possibilities of digital analyses. Through their own project, students learn how to break down research questions into separate, manageable technical tasks and identify which types of problems can and which can’t be resolved using digital history methods.
date: 09-09-2024
bibliography: references.bib
---

## Introduction
As historians today, we profit from an unprecedented availability of historical sources online, with most of the information contained in these sources digitally accessible. This greatly facilitates the use of computer-assisted methods to support or augment historical analyses. How and when to use which method in a research endeavor cannot easily be determined in advance, as the appropriate techniques more often than not have to be clarified or revised during a project. Therefore, we need to find a way not only to teach computer-assisted methods to history students, but also to enable them to conceptualize a historical research project and to solve technical problems along the way, empowering them to develop and apply different methods in a practical and inspiring way. In the following, I will discuss an approach that proposes designing semester-long courses with a thematic focus, where students progressively learn how to use computational tools through continuous engagement with a historical source.

## Motivation and Course Design
In a text-based field like history, techniques such as text recognition, text/data mining or natural language processing are very valuable for historical analyses [see for example @jockers_text-mining_2016]. However, university courses for history students should go beyond merely teaching a specific technique. They should also equip digital novices with the skills to navigate the digital realm, whether that involves (basic) computer skills, effective collaboration on projects or questions related to data management [for two recent handbooks on how to teach digital history see @battershill_using_2022; @guiliano_primer_2022].

Over the years, I have experimented with various course designs, with introductions to specific software as well as to programming languages. Courses built around programming as a means of analysing historical sources, however, have consistently produced the best results in both project outcomes and course evaluations.[^4] Now, with the rise of large language models, it has been argued that AI can easily generate any script, prompting some to question the necessity of teaching programming. For effective use of this technology, though, learning basic programming skills is essential. Relying on AI-generated output without understanding its mechanics will result in mistakes, unnoticed misinterpretations, and, eventually, useless research. By learning the basics of scripting, students not only acquire the ability to perform their own analyses on a data set, but also learn how to use generative AI productively, enabling them to critically assess, correct and refine the output.

The current curriculum at the Department of History at the University of Basel does not include foundational, semester-long courses that cover digital literacy or computational skills on a broad basis for all students. Since 2022, however, a self-paced introductory course to digital history has become a mandatory part of the first semester [@serif_introduction_2022]. This course provides an initial overview of digital methods and their use for historical research, along with a practical component where students learn to apply different methods to a corpus. They are gently introduced to the command line,[^3] learning about APIs, regular expressions, string extraction, automation and other relevant techniques, as well as ways to visualize first results. By encountering a computer-based approach to historical sources early in their studies, students become more aware of the subject when planning their courses for the following semesters.
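
To give an impression of what string extraction with regular expressions can look like at this level, the following is a minimal sketch in Python. It is not taken from the introductory course; the sample lines, the pattern and the names `ads`, `YEAR` and `extract_years` are invented for illustration, and the only assumption is that a year appears as a four-digit number between 1500 and 1899.

```python
import re

# Hypothetical sample lines; the wording is invented for illustration only.
ads = [
    "Zu verkaufen: Gellerts Fabeln, Leipzig 1748, gebunden.",
    "Ein Buch ohne Jahr, in Octav.",
    "Hallers Gedichte, Bern 1732; item Basel 1750.",
]

# Assumption: a year is a four-digit number from 1500 to 1899
# that is not part of a longer run of digits.
YEAR = re.compile(r"(?<!\d)(1[5-8]\d{2})(?!\d)")


def extract_years(text: str) -> list[int]:
    """Return every year-like number found in one line of text."""
    return [int(match) for match in YEAR.findall(text)]


# Automation: the same small step is applied to every line of the corpus.
years_per_ad = [extract_years(ad) for ad in ads]
```

The pattern is deliberately narrow. Part of the exercise is to discuss which real cases it would miss, such as years written in Roman numerals or abbreviated dates.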

In the absence of a comprehensive introductory course, the digital history courses I offer still begin at a basic level. By showing students the command line as a way to use the computer, I aim to dispel any unfounded fears and encourage a different way of thinking: A task that initially seems overwhelming can be broken down into several small steps, leading to its completion. I let students work on a small multi-step task that increases their motivation and demonstrates the potential relevance of these methods for historical research. The courses are student-project based, and while we also discuss some examples of digital history projects and reflect on the methods used [reading assignments include @romein_state_2020; @graham_exploring_2022; and @lemercier_quantitative_2019], the focus lies on learning by doing, that is, on students working with their own material towards the completion of their project.

When developing such a course, I take inspiration from the problems I face. In ongoing research, for example, I am examining book advertisements placed in an early modern newspaper.[^1] One part of this project requires matching the advertised titles with an existing database of printed books[^2] in order to enrich the dataset with additional information such as format, number of pages, edition, or genre. To achieve this, one has to overcome a series of obstacles, such as extracting book titles from an advertisement, creating database queries, transforming the received format, handling missing or incorrect metadata, and adding the information to the original dataset. Neither a simple list of commands nor an out-of-the-box solution exists that solves all the problems encountered while working on the source at once. However, when tackled individually, each step becomes more understandable and manageable for a programming novice. Used as a classroom example, this links programming directly to a specific historical source and to very concrete questions, some as simple as determining the number of book titles advertised in the newspaper during a particular year.
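
The decomposition described above can be sketched in a few lines of Python. This is an illustrative toy, not the code used in the research project: the advertisements, the page counts and the dictionary `CATALOGUE` are invented, the dictionary merely stands in for a query to the external database, and the sketch assumes that titles appear between double quotation marks in the transcription.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ad:
    year: int
    text: str


# Hypothetical stand-in for the external catalogue; all values are invented.
CATALOGUE = {
    "fabeln und erzaehlungen": {"format": "8°", "pages": 192},
    "versuch schweizerischer gedichte": {"format": "8°", "pages": None},
}


def extract_title(text: str) -> Optional[str]:
    """Step 1: pull the title out of the ad (assumed to be in double quotes)."""
    match = re.search(r'"([^"]+)"', text)
    return match.group(1) if match else None


def normalize(title: str) -> str:
    """Step 2: standardize spelling so that variants can be compared."""
    table = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})
    return re.sub(r"\s+", " ", title.lower().translate(table)).strip()


def enrich(ad: Ad) -> dict:
    """Steps 3-5: look the title up, tolerate gaps, add the metadata."""
    title = extract_title(ad.text)
    record = {"year": ad.year, "title": title, "format": None, "pages": None}
    if title is not None:
        entry = CATALOGUE.get(normalize(title))
        if entry is not None:  # no catalogue match: metadata stays missing
            record.update(entry)
    return record


ads = [
    Ad(1750, 'Bey Herrn N. ist zu haben: "Fabeln und Erzählungen", gebunden.'),
    Ad(1750, 'Zu verkaufen: "Versuch  Schweizerischer Gedichte".'),
    Ad(1751, "Ein unbekanntes Buch wird zum Kauf angetragen."),
    Ad(1751, 'Zu haben: "Ein Titel ohne Katalogeintrag".'),
]

records = [enrich(ad) for ad in ads]

# A concrete question: how many titles were advertised per year?
titles_per_year = Counter(r["year"] for r in records if r["title"] is not None)
```

Each function covers exactly one of the obstacles named above, so it can be written, tested and discussed on its own before the steps are combined. Replacing the dictionary lookup with a real database query changes only one function and leaves the rest of the sketch untouched.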

Through small programming exercises, students learn principles of automation, standardization, scalability, etc., while also understanding the importance of metadata and data formats. Examples are drawn from the same historical corpora that will be used in their projects later in the semester, allowing the students to become familiar with both the methods and the sources. This approach helps students gradually understand the potential of computer-assisted analysis and how to apply it to historical research. After being equipped with the technical basics, students form small groups to develop research questions that they would like to explore using digital methods. At the end of the semester, they present their findings, including any challenges faced, discuss how their analyses addressed their initial questions, and reflect on further analyses that could be performed and new questions that arise in light of their results. This structure ensures that students remain connected to the historical material, learn programming not merely for its own sake, and come to identify the usefulness and understand the limitations of computer-assisted methods in answering historical research questions.

[^1]: The subset of book ads was created in the context of an SNF project, see <https://avisblatt.philhist.unibas.ch/>.

[^2]: Verzeichnis Deutscher Drucke des 18. Jahrhunderts VD18, <https://vd18.k10plus.de>.

[^3]: Whether the command line is necessary is probably the most debated point. I found support for introducing it in, among others, [@blaney_doing_2021].

[^4]: Either R or Python is taught, as both offer a wide range of packages and libraries for humanities data as well as abundant tutorials for different methods.

## Conclusion and Outlook
The described courses provide an opportunity for every student to learn how to use computer-assisted methods for historical research. From the course evaluations we know that the courses have been largely appreciated, and that there is a strong demand for more classes of this kind. Furthermore, a significant portion of the participants come from other humanities disciplines, as their own curricula lack equivalent courses. Admittedly, the learning curve is quite steep, and the pace at the beginning is fast. In the current setting, this is unavoidable, but those who persevere often enjoy experimenting with their new skills and achieve unexpected results.

So far, only a few students have chosen to focus on computational analyses for their bachelor's or master's thesis,[^5] mostly because they do not feel fully confident with their new skill set (and also because potential supervisors often lack sufficient expertise to support them). Consequently, changes in the humanities curriculum seem necessary if we aim to educate more students in digital methods for historical research. With the increasing prominence of large language models, it seems all the more crucial to ensure that future historians can produce verifiable and reproducible results, leveraging computer-assisted methods both effectively and meaningfully.

[^5]: Some of the underlying ideas for the analytic part of the master's thesis of @dickmann_topographien_2022 were developed by him in a course of mine in fall 2020, see <https://github.com/LarsDIK/avis-analysis>.

### References

108 changes: 108 additions & 0 deletions submissions/687/references.bib
@@ -0,0 +1,108 @@

@book{battershill_using_2022,
address = {London New York Oxford New Delhi Sydney},
edition = {Second edition},
title = {Using digital humanities in the classroom: a practical introduction for teachers, lecturers, and students},
isbn = {978-1-350-18092-5 978-1-350-18089-5 978-1-350-18090-1},
shorttitle = {Using digital humanities in the classroom},
language = {eng},
publisher = {Bloomsbury Academic},
author = {Battershill, Claire and Ross, Shawna},
year = {2022},
}

@book{blaney_doing_2021,
address = {Manchester},
series = {{IHR} research guides},
title = {Doing digital history: a beginner's guide to working with text as data},
isbn = {978-1-5261-3268-0},
shorttitle = {Doing digital history},
language = {eng},
publisher = {Manchester University Press},
author = {Blaney, Jonathan and Winters, Jane and Milligan, Sarah and Steer, Martin},
year = {2021},
}

@incollection{jockers_text-mining_2016,
title = {Text-{Mining} the {Humanities}},
booktitle = {A {New} {Companion} to the {Digital} {Humanities}},
author = {Jockers, Matthew L. and Underwood, Ted},
editor = {Schreibman, Susan and Siemens, Ray and Unsworth, John},
year = {2016},
pages = {291--306},
}

@book{lemercier_quantitative_2019,
address = {Charlottesville},
title = {Quantitative {Methods} in the {Humanities}. {An} {Introduction}},
isbn = {978-0-8139-4270-4},
shorttitle = {Quantitative methods in the humanities},
publisher = {University of Virginia Press},
author = {Lemercier, Claire and Zalc, Claire},
year = {2019},
}


@book{graham_exploring_2022,
edition = {2},
title = {Exploring {Big} {Historical} {Data}: {The} {Historian}'s {Macroscope}},
isbn = {9789811243035 9789811243042},
shorttitle = {Exploring {Big} {Historical} {Data}},
url = {https://www.worldscientific.com/worldscibooks/10.1142/12435},
language = {en},
urldate = {2022-08-23},
publisher = {World Scientific},
author = {Graham, Shawn and Milligan, Ian and Weingart, Scott B and Martin, Kim},
month = mar,
year = {2022},
doi = {10.1142/12435},
}

@article{romein_state_2020,
title = {State of the {Field}: {Digital} {History}},
volume = {105},
issn = {0018-2648, 1468-229X},
shorttitle = {State of the {Field}},
url = {https://onlinelibrary.wiley.com/doi/10.1111/1468-229X.12969},
doi = {10.1111/1468-229X.12969},
language = {en},
number = {365},
urldate = {2022-09-15},
journal = {History},
author = {Romein, C. Annemieke and Kemman, Max and Birkholz, Julie M. and Baker, James and De Gruijter, Michel and Meroño‐Peñuela, Albert and Ries, Thorsten and Ros, Ruben and Scagliola, Stefania},
month = apr,
year = {2020},
pages = {291--312},
}

@book{guiliano_primer_2022,
address = {Durham},
series = {Design principles for teaching history},
title = {A primer for teaching digital history: ten design principles},
isbn = {978-1-4780-1505-5 978-1-4780-1768-4},
shorttitle = {A primer for teaching digital history},
publisher = {Duke University Press},
author = {Guiliano, Jennifer},
year = {2022},
}


@misc{serif_introduction_2022,
title = {Introduction to {Digital} {History}},
url = {https://wissen-ist-acht.github.io/digitalhistory.intro/},
urldate = {2024-09-06},
author = {Serif, Ina},
year = {2022},
}

@phdthesis{dickmann_topographien_2022,
type = {Thesis},
title = {Topographien des {Verlorenen}. {Zur} {Praxis} des {Verlierens} und {Findens} im {Basler} ‘{Avisblatt}’, 1729-1844},
copyright = {info:eu-repo/semantics/closedAccess},
url = {https://edoc.unibas.ch/91836/},
urldate = {2024-09-09},
school = {Universität Basel},
author = {Dickmann, Lars},
year = {2022},
}