Submission 484, Reisacher/Dubey #57

Merged
merged 2 commits into from
Aug 30, 2024
8 changes: 8 additions & 0 deletions submissions/poster/484/_quarto.yml
@@ -0,0 +1,8 @@
project:
  type: manuscript

manuscript:
  article: index.qmd

format:
  html: default
32 changes: 32 additions & 0 deletions submissions/poster/484/index.qmd
@@ -0,0 +1,32 @@
---
submission_id: 484
categories: 'Poster Session'
title: Swiss Google Books for Research
author:
  - name: Martin Reisacher
    orcid: 0009-0008-4529-5291
    email: [email protected]
    affiliations:
      - University of Basel, University Library
  - name: Eric Dubey
    orcid: 0000-0002-9300-9762
    email: [email protected]
    affiliations:
      - University of Basel, University Library
date: 08-29-2024
---

UB Bern, ZHB Lucerne, ZB Zurich and UB Basel are digitizing a large part of their 18th- and 19th-century holdings in collaboration with Google Books. This digital collection, which is accessible in full text, is intended to open up new possibilities for digital and data-driven research and teaching, e.g. in the context of text and data mining and distant reading.

Due to its size (90 million pages), the collection offers many opportunities, but also presents libraries and researchers with new challenges. Google's algorithms are responsible for image processing, book composition and full-text recognition. The data must therefore be expected to change continuously, because updated algorithms deliver new versions of the data. This steadily improves quality, but the process remains a black box, which makes it difficult to make transparent statements about how the data are produced.

The four partner libraries are currently working on a project (“Google Books for Research”) that addresses the following topics:

* Research and teaching requirements for large digital historical text collections
* State-of-the-art solutions for research-orientated accessibility of large historical text collections
* Data quality and enrichment
* Infrastructure solutions

The central question is how libraries, as cultural and memory institutions, can offer relatively generic infrastructure in the digital space and keep it stable, while still allowing it to be used flexibly enough for very specific research questions and methods.

As part of the poster session, we will present the results of the preliminary project and would like to explore these further with the audience.