    Repositories list

    • core

      Public
      Collection of OCR-related Python tools and wrappers from @OCR-D
      Python
      Apache License 2.0
      Updated Nov 21, 2024
    • HTML
      Creative Commons Attribution 4.0 International
      Updated Nov 11, 2024
    • Website for OCR-D specs, formats, requirements
      HTML
      Updated Nov 11, 2024
    • Binarize with Olena/scribo
      Shell
      GNU General Public License v2.0
      Updated Nov 7, 2024
    • DFKI Layout Detection for OCR-D
      Python
      Apache License 2.0
      Updated Nov 5, 2024
    • Recognize text using Calamari OCR and the OCR-D framework
      Python
      Apache License 2.0
      Updated Oct 29, 2024
    • Wrapper for the kraken OCR engine
      Python
      Apache License 2.0
      Updated Oct 28, 2024
    • ocrd_froc

      Public
      Python
      Apache License 2.0
      Updated Oct 22, 2024
    • Vue
      Apache License 2.0
      Updated Oct 22, 2024
    • OCR-D wrapper for ocr-fileformat
      Shell
      Apache License 2.0
      Updated Oct 16, 2024
    • ocrd_all

      Public
      Master repository which includes most other OCR-D repositories as submodules
      Makefile
      MIT License
      Updated Oct 16, 2024
    • Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
      JavaScript
      MIT License
      Updated Oct 11, 2024
    • Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)
      Python
      Apache License 2.0
      Updated Oct 10, 2024
    • Run ImageMagick with an OCR-D CLI
      Shell
      Apache License 2.0
      Updated Oct 1, 2024
    • Simple character-based language model using Keras
      Python
      Apache License 2.0
      Updated Oct 1, 2024
    • Python
      Apache License 2.0
      Updated Oct 1, 2024
    • assets

      Public
      Test data for testing specs and software in @OCR-D
      Makefile
      Updated Sep 30, 2024
    • Middleware for running Quiver locally
      Python
      Updated Sep 24, 2024
    • Benchmarking OCR-D workflows in Docker
      HTML
      MIT License
      Updated Sep 20, 2024
    • OCR-D-compliant page segmentation
      Python
      MIT License
      Updated Sep 5, 2024
    • Run Tesseract through the tesserocr bindings with @OCR-D's interfaces
      Python
      MIT License
      Updated Aug 21, 2024
    • spec

      Public
      Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)
      Python
      Updated Aug 21, 2024
    • The OCR-D Ground Truth text and structure corpus was created between 2015 and 2017. Since 2017, the corpus has been further curated and supplemented with metadata where appropriate. It includes PAGE XML files with annotations of the text and structure.
      Creative Commons Attribution Share Alike 4.0 International
      Updated Jul 31, 2024
    • Python
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024
    • The repo gt_structure_5_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024
    • The repo gt_structure_5_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024
    • The repo gt_structure_5_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024
    • The repo gt_structure_4_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024
    • The repo gt_structure_4_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024
    • The repo gt_structure_4_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      Updated Jun 24, 2024