gt-fraktur is the Ground Truth (GT) data for Fraktur/Gothic prints from the 19th Century, released by UB, Uni-Tübingen as Open Data under the CC0 public license.
This repository contains transcriptions of selected pages from 19th Century books as listed below. The original TIFF images used for OCR transcription of the following publications are published on Archive.org under the CC0 public license.
The Shelfmark / DigitalIDs of the 19th Century Fraktur prints selected for transcription:
Details of the page quality issues observed during the transcription process:
# | Shelfmark-DigitalID | Quality Issues |
---|---|---|
1. | artl_002 | artl_002_00010.tif has bad alignment |
2. | litrdsch_1875 | Misprint |
3. | litrdsch_1875 | Misprint: litrdsch_1875_0146.tif (page 28), lines 6-38 in the left column |
4. | thlblb_1866 | thlblb_1866_00037.tif has a crossed "o" (ø, Unicode U+00F8) in the word "Redaction" in multiple places on the page. These were manually corrected to a regular "o" during transcription. |
5. | thlblb_1866 | thlblb_1866_00121.tif, right column: the long ſ appears to have been corrected manually |
6. | thlblb_1866 | thlblb_1866_00425.tif, left column: the word "fünfte" is blurred and appears to contain two "f" characters. |
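
The crossed "o" correction noted for thlblb_1866_00037.tif was made by hand. For anyone checking other transcriptions for the same character, the following is a minimal sketch of how it could be located and replaced. The function names and the sample string are hypothetical and are not part of this repository.

```python
CROSSED_O = "\u00f8"  # ø, the crossed "o" noted in the table above


def find_crossed_o(text):
    """Return 1-based (line, column) positions of every ø in the text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == CROSSED_O:
                hits.append((line_no, col))
    return hits


def normalize_crossed_o(text):
    """Replace every ø with a regular 'o', leaving all other characters as-is."""
    return text.replace(CROSSED_O, "o")


# Hypothetical sample text, not taken from the actual transcription files.
sample = "Die Redacti\u00f8n\nRedaction"
positions = find_crossed_o(sample)
cleaned = normalize_crossed_o(sample)
```

Reporting positions before replacing makes it possible to review each hit against the page image, since a blanket replacement would also alter any legitimate ø in the text.
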
- This data is released by UB, Uni-Tuebingen as Open Data under the CC0 public license.