OCR performance improved

As requested by our users, the /OCR endpoint now supports more languages, including Arabic, Modern Hebrew, Russian, and Simplified Chinese.

Bounding box coordinates are now enabled by default. For each request, besides the full text output, you get a bbox array in which each entry holds an extracted word and the coordinates of its bounding box (i.e. rectangle). Each entry in this array is a JSON object of the following form:


{
    word: Extracted word,
    x: X coordinate of the top left corner,
    y: Y coordinate of the top left corner,
    w: Width of the rectangle that encloses this word,
    h: Height of the rectangle that encloses this word
}
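
As a rough illustration of how such entries might be consumed, here is a minimal Python sketch that turns each one into corner coordinates. The sample response body is made up for this example, and the sketch assumes the array sits under a top-level "bbox" key with numeric x, y, w and h values; check the documentation for the exact response layout.

```python
import json

# Hypothetical response body: the words and numbers are invented for
# illustration, and the top-level "bbox" key is an assumption.
sample = """
{
  "bbox": [
    {"word": "Hello", "x": 10, "y": 20, "w": 50, "h": 15},
    {"word": "world", "x": 70, "y": 20, "w": 55, "h": 15}
  ]
}
"""


def word_boxes(response_text):
    """Return (word, left, top, right, bottom) tuples from a response body."""
    data = json.loads(response_text)
    boxes = []
    for entry in data.get("bbox", []):
        left, top = entry["x"], entry["y"]
        # Right and bottom edges follow from the top left corner plus w and h.
        boxes.append((entry["word"], left, top, left + entry["w"], top + entry["h"]))
    return boxes


for box in word_boxes(sample):
    print(box)
```

Corner coordinates of this kind are the form most image libraries expect when drawing or cropping rectangles.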

The documentation has been updated and is available at https://pixlab.io/cmd?id=ocr, and a Python sample is provided on GitHub at https://github.com/symisc/pixlab/blob/master/python/ocr.py.

With the bounding boxes in hand, you can further tune your analysis phase, for example by extracting each word via /crop and performing another pass if desired.
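
One way to prepare such a second pass is to compute a slightly padded crop rectangle for every word, clamped to the image bounds. The sketch below is only an illustration: the crop_regions helper, the padding value, and the "width"/"height" key names are assumptions made for this example, not the documented /crop parameters, so check the documentation before sending these values to the API.

```python
def crop_regions(bbox, image_width, image_height, padding=2):
    """Build one padded crop rectangle per bbox entry.

    Hypothetical helper: the output key names are illustrative and are
    not taken from the /crop documentation.
    """
    regions = []
    for entry in bbox:
        # Grow the rectangle by `padding` pixels on each side,
        # without leaving the image.
        left = max(entry["x"] - padding, 0)
        top = max(entry["y"] - padding, 0)
        right = min(entry["x"] + entry["w"] + padding, image_width)
        bottom = min(entry["y"] + entry["h"] + padding, image_height)
        regions.append({
            "word": entry["word"],
            "x": left,
            "y": top,
            "width": right - left,
            "height": bottom - top,
        })
    return regions


# Made-up entry close to the top left corner of a 100x50 image.
example = [{"word": "Hi", "x": 1, "y": 0, "w": 10, "h": 5}]
print(crop_regions(example, image_width=100, image_height=50))
```

A little padding tends to help when a tight rectangle clips the edges of a glyph, which is why the helper grows each box before clamping it.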


Author: Root

Master of AI (Not related to Mr Wayne)
