This repository provides materials for a session that is part of the I2DS Tools for Data Science workshop run at the Hertie School, Berlin on November 4, 2021. The student-run workshop is part of the course Introduction to Data Science taught by Simon Munzert at the Hertie School, Berlin, in Fall 2021.
Anna-Lisa Wirth prepared the (practical) example for the lecture after the formal introduction and the live coding session.
Lukas Warode prepared the presentation slides (using xaringan) for the first part of the lecture, where the basic concepts of regex are explained.
The session will a) introduce the audience to the basic concepts of regular expressions (regex) and b) provide some real-world examples where regex is particularly useful. A further goal is to apply stringr functions in combination with regex, since stringr is one of the crucial packages within the tidyverse.
The aim of this session is to (1) familiarize the audience with the basic concepts of regex without overwhelming them. Because regex allows very comprehensive, and therefore complicated, string detection mechanisms that exceed the scope of this session, the main learning objective is (2) to focus on the fundamentals of using regex for string manipulation with stringr. To demonstrate the power of regex in more detailed and complex applications, (3) slightly advanced, real article examples show the audience its potential in different string detection tasks.
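To give a flavour of the fundamentals named in objective (2), here is a minimal sketch of regex used with stringr. The example strings and variable names are hypothetical and are not taken from the session materials; only the stringr functions `str_detect()`, `str_extract()`, and `str_replace_all()` are real.

```r
library(stringr)

# Hypothetical example strings (not from the session materials)
sessions <- c("Regex session: 2021-11-04", "No date given")

# Pattern: four digits, a hyphen, two digits, a hyphen, two digits
date_pattern <- "\\d{4}-\\d{2}-\\d{2}"

# Does each string contain a date-like substring?
has_date <- str_detect(sessions, date_pattern)

# Extract the first match per string (NA where nothing matches)
dates <- str_extract(sessions, date_pattern)

# Replace every run of whitespace with a single underscore
cleaned <- str_replace_all(sessions, "\\s+", "_")
```

Note that backslashes are doubled because the pattern is written as an R string; the regex engine itself sees `\d{4}-\d{2}-\d{2}`.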
The material in this repository is made available under the MIT license.