MicroQA/microqa/ocr/__init__.py

2025-11-07 05:41:18 +00:00
"""
This package contains interchangeable engines for optical character recognition,
making it easy to swap implementations in and out to trade off speed against
accuracy without rewriting business logic.

Each nested module exports a class named `OcrEngine` with a method named
`process()`, which accepts a PIL `Image` and a list of languages and returns a
tuple of a standardized `DataFrame` and a dictionary holding any additional
engine-specific metadata exposed by the underlying OCR engine. The `DataFrame`
has columns `["text", "x0", "y0", "x1", "y1"]`, where X and Y coordinates are
in pixels measured from the top-left corner of the image. `x1` and `y1` values
will be greater than or equal to the corresponding `x0` and `y0` values.
"""
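
The contract described in the docstring can be illustrated with a minimal sketch. Everything named here is hypothetical and not part of MicroQA: `StubOcrEngine` is a placeholder that performs no recognition, `check_contract` is an illustrative helper, and the `"languages"` metadata key is invented for the example. The sketch also assumes the `DataFrame` is a pandas `DataFrame`, which the docstring does not state explicitly.

```python
from typing import Any

import pandas as pd
from PIL import Image

# Column names taken from the package docstring.
EXPECTED_COLUMNS = ["text", "x0", "y0", "x1", "y1"]


class StubOcrEngine:
    """Hypothetical engine that emits one fixed word; not a real backend."""

    def process(
        self, image: Image.Image, languages: list[str]
    ) -> tuple[pd.DataFrame, dict[str, Any]]:
        width, height = image.size
        # One fake word box spanning the whole image, in pixel coordinates
        # measured from the top-left corner.
        df = pd.DataFrame(
            [{"text": "stub", "x0": 0, "y0": 0, "x1": width, "y1": height}],
            columns=EXPECTED_COLUMNS,
        )
        # Engine-specific extras go in the second tuple element
        # (the "languages" key is an assumption made for this sketch).
        return df, {"languages": list(languages)}


def check_contract(df: pd.DataFrame) -> None:
    """Raise ValueError if df breaks the documented column/coordinate rules."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    if not ((df["x1"] >= df["x0"]) & (df["y1"] >= df["y0"])).all():
        raise ValueError("x1/y1 must be >= the corresponding x0/y0")


df, meta = StubOcrEngine().process(Image.new("RGB", (100, 50)), ["eng"])
check_contract(df)
```

Because calling code depends only on `process()` and the standardized columns, a check like `check_contract` could be run against any engine's output when swapping implementations.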