Passports, Travel Documents & ID Cards Scan API Endpoint Available

The PixLab OCR team is pleased to introduce the /docscan API endpoint, which lets you scan a government-issued document, such as a passport or an ID card from certain countries, in a single call.

Besides its accurate text scanning capabilities, the /docscan API endpoint also extracts (crops) any detected face and transforms the extracted text content, such as the passport Machine Readable Zone (MRZ), into structured fields. Below is a typical output of the /docscan endpoint for a passport input image:

[Figure: input passport specimen (input image URL), shown with the resulting parsed fields and MRZ fields]
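To make the MRZ-to-fields transformation more concrete, here is a minimal local sketch of how the second line of a passport (TD3) MRZ splits into fields under the fixed-width layout defined by ICAO Doc 9303. It is only an illustration and not PixLab's implementation: the function name `parse_td3_line2` and the layout table are hypothetical, the key names merely mirror those used in the Python sample further down, and the value formats returned by /docscan may differ.

```python
# Fixed-width layout of line 2 of a TD3 (passport) MRZ, per ICAO Doc 9303.
# Offsets are 0-based and end-exclusive. The key names are borrowed from the
# /docscan sample for readability; this table itself is a hypothetical helper.
# The per-field check digits at offsets 19, 27 and 42 are skipped for brevity.
TD3_LINE2_LAYOUT = {
    'documentNumber':  (0, 9),
    'checkDigit':      (9, 10),
    'nationality':     (10, 13),
    'dateOfBirth':     (13, 19),   # YYMMDD
    'sex':             (20, 21),
    'dateOfExpiry':    (21, 27),   # YYMMDD
    'personalNumber':  (28, 42),
    'finalcheckDigit': (43, 44),
}


def parse_td3_line2(line):
    """Split the second TD3 MRZ line into named fields (local sketch only)."""
    if len(line) != 44:
        raise ValueError('A TD3 MRZ line must be exactly 44 characters long')
    # '<' is the MRZ filler character, so trailing fillers are dropped.
    return {
        name: line[start:end].rstrip('<')
        for name, (start, end) in TD3_LINE2_LAYOUT.items()
    }


if __name__ == '__main__':
    # Second MRZ line of the ICAO Doc 9303 specimen passport.
    specimen = 'L898902C36UTO7408122F1204159ZE184226B<<<<<10'
    for name, value in parse_td3_line2(specimen).items():
        print(name + ': ' + value)
```

In practice the endpoint does this work for you, including locating and reading the MRZ in the image; the sketch only shows why the zone can be decoded reliably once its text is known.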

Face extraction is performed automatically using the /facedetect API endpoint. For general-purpose Optical Character Recognition, you should rely on the /OCR endpoint instead. If you are dealing with PDF documents, you can first convert them to raw images via the /pdftoimg endpoint. This endpoint is available on the Prod Plan and up.

Below is a typical Python code snippet for scanning passports:

import requests

# Given a government-issued passport document, extract the user's face and parse all MRZ fields.
#
# PixLab recommends that you connect your AWS S3 bucket via your dashboard at https://pixlab.io/dashboard
# so that any cropped face or MRZ crop is stored automatically on your S3 bucket rather than the PixLab one.
# This feature should give you full control over your analyzed media files.
#
# See https://pixlab.io/#/cmd?id=docscan for additional information.

req = requests.get('https://api.pixlab.io/docscan', params={
    'img': 'https://i.stack.imgur.com/oJY2K.png',  # Passport sample
    'type': 'passport',  # Type of document we are going to scan
    'key': 'Pixlab_key'  # Replace with your own PixLab API key
})
reply = req.json()
if reply['status'] != 200:
    print(reply['error'])
else:
    print("User Cropped Face: " + reply['face_url'])
    print("MRZ Cropped Image: " + reply['mrz_img_url'])
    print("Raw MRZ Text: " + reply['mrz_raw_text'])
    print("MRZ Fields: ")
    # Display all parsed MRZ fields
    print("\tIssuing Country: " + reply['fields']['issuingCountry'])
    print("\tFull Name: " + reply['fields']['fullName'])
    print("\tDocument Number: " + reply['fields']['documentNumber'])
    print("\tCheck Digit: " + reply['fields']['checkDigit'])
    print("\tNationality: " + reply['fields']['nationality'])
    print("\tDate Of Birth: " + reply['fields']['dateOfBirth'])
    print("\tSex: " + reply['fields']['sex'])
    print("\tDate Of Expiry: " + reply['fields']['dateOfExpiry'])
    print("\tPersonal Number: " + reply['fields']['personalNumber'])
    print("\tFinal Check Digit: " + reply['fields']['finalcheckDigit'])
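The check digit fields printed above can be cross-checked on the client side if you want an extra sanity test on a scan. The sketch below implements the standard ICAO Doc 9303 check digit rule (repeating weights 7, 3, 1; digits keep their value, A to Z map to 10 to 35, the filler '<' counts as 0; the result is the sum modulo 10). The helper names `mrz_check_digit` and `is_check_digit_valid` are hypothetical, and the sketch assumes the values are in raw MRZ form (for example dates as YYMMDD), which may not match how /docscan formats its fields.

```python
# Client-side MRZ check digit verification following ICAO Doc 9303.
# Hypothetical helpers for illustration; not part of the PixLab API.

def mrz_char_value(char):
    """Map one MRZ character to its numeric value."""
    if char.isdigit():
        return int(char)
    if 'A' <= char <= 'Z':
        return ord(char) - ord('A') + 10
    if char == '<':  # filler character
        return 0
    raise ValueError('Invalid MRZ character: ' + repr(char))


def mrz_check_digit(value):
    """Compute the check digit of an MRZ field as a one-character string."""
    weights = (7, 3, 1)
    total = sum(
        mrz_char_value(char) * weights[index % 3]
        for index, char in enumerate(value)
    )
    return str(total % 10)


def is_check_digit_valid(value, check_digit):
    """Return True if check_digit matches the digit computed from value."""
    return mrz_check_digit(value) == check_digit


if __name__ == '__main__':
    # Document number and check digit of the ICAO Doc 9303 specimen passport.
    print(is_check_digit_valid('L898902C3', '6'))  # True
```

With a /docscan reply in hand, the same call could be made as `is_check_digit_valid(reply['fields']['documentNumber'], reply['fields']['checkDigit'])`, provided the document number is returned in its raw MRZ form.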

Finally, the official endpoint documentation is available at https://pixlab.io/cmd?id=docscan, and a set of working samples in various programming languages is available on the PixLab samples pages.