cisocrgroup/OCR-Workshop


Licence

CIS OCR Workshop by Uwe Springmann and Florian Fink is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Citation

Uwe Springmann et al. (2016). CIS OCR Workshop v1.0: OCR and postcorrection of early printings for digital humanities. Zenodo. DOI: 10.5281/zenodo.46571

Workshop: OCR and postcorrection of early printings for digital humanities

Centrum für Informations- und Sprachverarbeitung (CIS), Ludwig-Maximilians-Universität München

Announcement

Here you will find updated material for the OCR workshop originally held at CIS, LMU, on 14/15 September 2015. The original material for the workshop has been archived under the given link.

The workshop consists of 11 modules (M1 to M11) covered in a 2-day course.

Schedule

| Day 1 | | Day 2 | |
|---|---|---|---|
| 10:00-10:20 | Welcome, overview | 09:00-10:00 | M7: Tesseract: Practice |
| 10:20-11:00 | M1: Challenges & methods | 10:00-11:00 | M8: ABBYY FineReader: Practice |
| 11:00-11:30 | M2: Image acquisition & preprocessing | 11:00-12:00 | M9: The CIS error profiling technology |
| 11:30-12:00 | M3: Preprocessing: Practice | | |
| 12:00-13:30 | Lunch time | 12:00-13:30 | Lunch time |
| 13:30-14:30 | M4: How to transform incunabula: Theory | 13:30-14:30 | M10: PoCoTo: Theory |
| 14:30-15:00 | Break | 14:30-15:00 | Break |
| 15:00-16:00 | M5: How to transform incunabula: Practice | 15:00-16:00 | M11: PoCoTo: Practice |
| 16:00-16:30 | M6: Other OCR engines: Tesseract, ABBYY | 16:00-16:15 | Feedback & discussion |
| 16:30-17:00 | Wrap-up: day 1 | 16:15-16:30 | Wrap-up |
| 18:30-?? | Dinner, evening sessions ad lib. | | |

Software requirements

Software to install on your laptop before the workshop so that you can participate actively in the practice sessions.

Additional data

The data needed for the practice sessions are here.
