CIS OCR Workshop by Uwe Springmann, Florian Fink is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Uwe Springmann et al. (2016). CIS OCR Workshop v1.0: OCR and postcorrection of early printings for digital humanities. Zenodo. 10.5281/zenodo.46571
Here you will find updated material for the OCR workshop originally held at CIS, LMU on 14/15 September 2015. The original material for the workshop has been archived under the given link.
The workshop consists of 11 modules (M1 to M11) covered in a 2-day course.
Time | Day 1 | Time | Day 2 |
---|---|---|---|
10:00-10:20 | Welcome, overview | 09:00-10:00 | M7: Tesseract: Practice |
10:20-11:00 | M1: Challenges & methods | 10:00-11:00 | M8: ABBYY FineReader: Practice |
11:00-11:30 | M2: Image acquisition & preprocessing | 11:00-12:00 | M9: The CIS error profiling technology |
11:30-12:00 | M3: Preprocessing: Practice | ||
12:00-13:30 | Lunch time | 12:00-13:30 | Lunch time |
13:30-14:30 | M4: How to transform incunabula: Theory | 13:30-14:30 | M10: PoCoTo: Theory |
14:30-15:00 | Break | 14:30-15:00 | Break |
15:00-16:00 | M5: How to transform incunabula: Practice | 15:00-16:00 | M11: PoCoTo: Practice |
16:00-16:30 | M6: Other OCR engines: Tesseract, ABBYY | 16:00-16:15 | Feedback & discussion |
16:30-17:00 | Wrap-up: day 1 | 16:15-16:30 | Wrap-up |
18:30-?? | Dinner, evening sessions ad lib. | | |
Software to install on your laptop before the workshop to enable active participation in the practice sessions.
Data needed for the practice sessions are here.