Text analysis with quanteda

Summary

This repository provides materials for a session that is part of the I2DS Tools for Data Science workshop run at the Hertie School, Berlin in November 2021. The student-run workshop is part of the course Introduction to Data Science taught by Simon Munzert at the Hertie School, Berlin, in Fall 2021.

Workshop contents

Please click here for our presentation html.

Please click here for the live session exercises html.

This workshop introduces text analysis with the quanteda package for R. Text analysis is the process of automatically classifying and extracting meaningful information from unstructured text. quanteda supports this work along with related natural language processing tasks such as corpus management, tokenization, and visualization.

Main learning objectives

The goals of this session are to (1) introduce you to pre-processing of text and management of document-feature matrices, (2) let you try out basic functions on corpora and tokens, and (3) provide you with practice material and some further readings.
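
For readers new to the terms above, the following is a minimal, language-neutral sketch of what pre-processing and a document-feature matrix involve. It is written in plain Python using only the standard library and is not quanteda code or taken from the workshop materials; the two example documents, the tiny stopword list, and the helper names `tokenize` and `build_dfm` are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical two-document corpus, invented for illustration.
docs = {
    "doc1": "Text analysis extracts information from text.",
    "doc2": "Tokenization splits text into tokens.",
}

# A tiny, hand-picked stopword list (an assumption, not quanteda's list).
STOPWORDS = {"from", "into"}


def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def build_dfm(documents):
    """Return (features, matrix) with one row of counts per document."""
    counts = {name: Counter(tokenize(text)) for name, text in documents.items()}
    # Features are the sorted union of all tokens seen in any document.
    features = sorted(set().union(*counts.values()))
    # Each row lists how often every feature occurs in that document.
    matrix = {name: [c[f] for f in features] for name, c in counts.items()}
    return features, matrix


features, dfm = build_dfm(docs)
print(features)
for name, row in dfm.items():
    print(name, row)
```

Each row of the resulting matrix corresponds to a document and each column to a feature (here, a lowercased word), which is the structure the session works with when managing document-feature matrices.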

Instructors

  • Kathryn Malchow
  • Federico Mammana

Further resources

License

The material in this repository is made available under the MIT license.

Statement of contributions

Kathryn Malchow prepared the showcase of functions and tools for the presentation and prepared the live tutorial.

Federico Mammana prepared the introduction to quanteda and the motivation section of the presentation and prepared the live tutorial.
