Full Scan Support for United Arab Emirates (UAE) ID/Residence Cards

The PixLab Document Scanner development team is pleased to announce that it now fully supports scanning Emirates (UAE) ID & Residence Cards via the /DOCSCAN API endpoint, in real time, using your favorite programming language.

When invoked, the /DOCSCAN HTTP API endpoint extracts (crops) any detected face and transforms the raw UAE ID/Residence Card content, such as holder name, nationality, and ID number, into a JSON object ready to be consumed by your app.
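
As a rough illustration of the call described above, here is a minimal Python sketch that uses only the standard library. The endpoint URL, the query parameter names (`img`, `type`, `country`, `key`) and the reply keys (`status`, `error`, `face_url`, `fields`) are assumptions made for this example, so check them against the official /DOCSCAN documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; verify against the official documentation.
DOCSCAN_URL = "https://api.pixlab.io/docscan"


def build_docscan_url(img_url, api_key, doc_type="idcard", country="uae"):
    """Build the GET URL for a scan request (parameter names are assumptions)."""
    query = urllib.parse.urlencode({
        "img": img_url,       # publicly reachable image of the card
        "type": doc_type,     # assumed document type value
        "country": country,   # assumed country code value
        "key": api_key,       # your API key
    })
    return f"{DOCSCAN_URL}?{query}"


def scan_document(img_url, api_key):
    """Send the request and return the decoded JSON reply as a dict."""
    url = build_docscan_url(img_url, api_key)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_reply(reply):
    """Flatten a reply into printable lines (the reply layout is an assumption)."""
    if reply.get("status") != 200:
        return [f"Error: {reply.get('error', 'unknown error')}"]
    lines = [f"Cropped face: {reply.get('face_url', 'n/a')}"]
    for name, value in reply.get("fields", {}).items():
        lines.append(f"{name}: {value}")
    return lines
```

A typical use would be to pass the result of `scan_document(image_url, api_key)` to `summarize_reply` and print each returned line. The request building and reply handling are kept in separate functions so that each can be checked without a network connection.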

Below is a typical output of the /DOCSCAN API endpoint for an Emirates (UAE) ID card input sample:

Input Emirates (UAE) ID Card

UAE ID card specimen

Extracted UAE ID Card Fields

UAE extracted fields

The code samples used to achieve this result are available via the following gists:

The same logic applies to scanning official travel documents like Visas, Passports, and ID Cards from many other countries in a unified manner, regardless of the underlying programming language used on your backend (Python, PHP, Ruby, JS, etc.), thanks to the /DOCSCAN API endpoint, as shown in previous blog posts:

Algorithm Details

Internally, PixLab's document scanner engine is based on PP-OCR, a practical ultra-lightweight OCR system composed mainly of three parts: DB text detection, detection frame correction, and CRNN text recognition. DB stands for Differentiable Binarization, a real-time scene text detection method.
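
To make the DB detection step more concrete, the sketch below shows the core idea of Differentiable Binarization as described in the DB literature: the hard threshold is replaced by a steep sigmoid, B = 1 / (1 + exp(-k (P - T))), where P is the predicted probability map, T the predicted threshold map, and k an amplifying factor (50 is the value commonly quoted). This is an illustration on plain Python lists, not PixLab's or PP-OCR's actual implementation; the function name and the sample values are made up for the example.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binary map: B = 1 / (1 + exp(-k * (P - T))) per pixel."""
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Toy 2x2 maps (illustrative values): text probability and learned threshold.
prob = [[0.9, 0.2], [0.5, 0.6]]
thresh = [[0.3, 0.3], [0.5, 0.3]]
binary = differentiable_binarization(prob, thresh)
```

Pixels whose probability is well above the local threshold are pushed close to 1 and those well below close to 0, while the function stays differentiable, so the threshold map can be learned together with the rest of the network.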

PP-OCR: A Practical Ultra Lightweight OCR System - Algorithm Overview

PP-OCR Algorithm Overview

The system adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module.

In PP-OCR, Differentiable Binarization (DB), which is based on a simple segmentation network, is used as the text detector. CRNN is used as the text recognizer: it integrates feature extraction and sequence modeling, and adopts the Connectionist Temporal Classification (CTC) loss to avoid the inconsistency between prediction and label.
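
The reason CTC resolves the mismatch between per-frame predictions and a shorter label is easiest to see at decoding time. The sketch below is a generic greedy CTC decoder, not code taken from PP-OCR: it picks the best class for each frame, merges consecutive repeats, and drops the blank symbol. The alphabet, the blank index and the sample scores are assumptions for illustration.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Pick the best class per frame, merge repeats, then drop blanks."""
    best = [max(range(len(s)), key=s.__getitem__) for s in frame_scores]
    decoded = []
    previous = blank
    for idx in best:
        # Emit a character only when it differs from the previous frame
        # and is not the blank symbol.
        if idx != previous and idx != blank:
            decoded.append(alphabet[idx])
        previous = idx
    return "".join(decoded)


def one_hot_frames(indices, num_classes):
    """Helper for the demo: turn class indices into per-frame score rows."""
    return [[1.0 if j == i else 0.0 for j in range(num_classes)] for i in indices]


# Index 0 is the blank symbol; seven frames collapse to a three-letter string.
ALPHABET = ["", "U", "A", "E"]
text = ctc_greedy_decode(one_hot_frames([1, 1, 0, 2, 2, 0, 3], 4), ALPHABET)
```

A repeated character in the label survives only when a blank frame separates the two occurrences, which is why the blank symbol is needed in the first place.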

The algorithm is further optimized in five aspects. The detection model adopts the CML (Collaborative Mutual Learning) knowledge distillation strategy and the CopyPaste data augmentation strategy. The recognition model adopts the LCNet lightweight backbone network, the U-DML knowledge distillation strategy, and an enhanced CTC loss function, which further improve inference speed and prediction accuracy.
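
Mutual-learning distillation strategies such as the ones named above are generally built around a term that pulls the output distributions of two jointly trained networks toward each other. The sketch below shows one common form of that term, a symmetric KL divergence between two softmax outputs. It is a generic illustration under that assumption, not the actual CML or U-DML training code, and the function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_sum = math.log(sum(math.exp(x - peak) for x in logits))
    return [x - peak - log_sum for x in logits]


def kl_divergence(log_p, log_q):
    """KL(p || q) computed from log-probabilities."""
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mutual_learning_loss(logits_a, logits_b):
    """Symmetric KL term: 0.5 * (KL(a || b) + KL(b || a))."""
    log_a = log_softmax(logits_a)
    log_b = log_softmax(logits_b)
    return 0.5 * (kl_divergence(log_a, log_b) + kl_divergence(log_b, log_a))
```

The term is zero when both networks produce the same distribution and grows as their predictions diverge. In a real training setup it would be combined with the task loss, for example the CTC loss for the recognition model.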

Talkie OCR - Image to Speech Now on the App Store

Developed by our colleague Mrad Chams from Symisc Systems and entirely powered by the PixLab OCR API endpoint.

Talkie OCR - Image to Speech

Talkie OCR is a state-of-the-art OCR scanner that turns almost any image containing human-readable characters into text, which is in turn converted into speech in your native language & accent. Built-in features include:

  • Automatically Recognizes the Input Language & Speaks your Accent: Once the scanned image (book page, magazine, journal, scientific paper, etc.) is recognized & transformed into text content, you'll be able to play back that text in your local accent, in any of over 45 languages of your choice!
  • State-of-the-art OCR processing algorithm powered by PixLab.
  • Speaks over 45 languages with their accents.
  • Built-in translation service to over 30 foreign languages of your choice.
  • Built-in Vision Impaired Mode with the ability to recognize the input language automatically.
  • Playback Pause & Resume on Request.
  • Offline Save for Later Read & Playback.

Download on the App Store Get it on Google Play