docscan

Introducing PixLab’s New PDF APIs: Convert, Scan, and Generate Documents with Vision AI

Seamless PDF automation with Vision AI, OCR, and Document Intelligence

We’re excited to announce the release of three powerful new PixLab PDF API endpoints, designed to make PDF workflows smarter, faster, and fully automated. With these additions, developers can now convert PDFs to images, extract text with OCR, scan official ID documents, and programmatically generate PDF documents with ease.

Whether you're building fintech onboarding flows, automated invoice systems, document processing pipelines, or business reporting tools, PixLab’s new PDF APIs are built for you.

✅ `pdftoimg` — Convert PDFs to Images

Convert every page inside a PDF into high-resolution images for preview, analysis, or AI-based processing.

Perfect for:
- Document previews & thumbnails
- Feeding pages into Vision LLMs
- AI-based document analysis pipelines

👉 Documentation → https://pixlab.io/endpoints/pdftoimg

✅ `pdfscan` — PDF Text Extraction & OCR

Transform scanned or image-based PDFs into searchable, machine-readable text using PixLab’s OCR engine.

Use Cases:
- Invoice and receipt automation
- Document processing
- Enterprise data extraction integrations

👉 Documentation → https://pixlab.io/endpoints/ocr

✅ `pdfgen` — Generate PDF Files Programmatically

Create clean, branded PDF documents from HTML, data payloads, or structured templates.

Ideal for:
- Invoice generation
- Digital certificates & reports
- Legal and business form creation
- Workflow automation & SaaS platforms

👉 Documentation → https://pixlab.io/endpoints/pdfgen

Built for Modern Document Automation

These Vision APIs are designed to support end-to-end intelligent document pipelines:

✅ Convert → Extract → Analyze → Export
✅ Multi-language OCR + Vision AI
✅ Works with spreadsheets, scanned forms, images, & office docs
✅ Simple API integration across all programming languages

Start Building with PixLab

These endpoints are now live and available to all PixLab users.

👉 Browse all PDF-related endpoints:
https://pixlab.io/api-endpoints

🔐 Get your API key:
https://console.pixlab.io

What’s Coming Next

PDF annotation API suite
Merge / split PDF tools
Structured invoice extraction templates
Vision-powered business form understanding

Document Intelligence for the AI Era

At PixLab, our mission is simple:
Deliver world-class Vision AI and Document Intelligence tools to every developer.

Got feedback or need help getting started?
💬 https://pixlab.io/support

— The PixLab Vision Team

Explore the New PixLab Vision Platform VLM API Endpoints

PixLab’s Vision VLM Platform introduces a groundbreaking set of Vision Language Model (VLM) endpoints that combine natural language processing and computer vision in a simple, developer-friendly API suite.

PixLab Vision Platform

From querying images and parsing complex documents to generating PDFs and extracting ID data, the PixLab VLM API makes it easy to integrate intelligent image and document analysis into your own apps.

Vision Language Model Endpoints

These endpoints allow your application to understand images and video frames with natural language intelligence.

/query – Ask natural language questions about images and receive contextual answers
/describe – Get a natural language description of an image
/tagimg – Retrieve tags describing the image content
/detect – Detect and localize objects in the image
/vocr – OCR via vision models for printed text
/nsfw – Detect explicit content in media

Unstructured Document Parsing

Parse unstructured documents like invoices, receipts, and contracts using VLM-powered tools.

/llm-parse – Extract data from complex document layouts using a user-defined schema

Embedding APIs

Turn your content into machine-understandable vectors for search, indexing, and matching.

/txt-embed – Generate semantic embeddings from raw text
/img-embed – Generate vector embeddings from images

These endpoints are perfect for building your own AI-powered similarity search or recommendation systems.
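
As a rough illustration of the similarity search idea, the sketch below ranks stored vectors against a query vector with cosine similarity. It does not call the PixLab API: the three-dimensional vectors and file names are made up and stand in for whatever /txt-embed or /img-embed would return in a real pipeline.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(
    query: Sequence[float], index: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k entries of `index` most similar to `query`, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (values invented for illustration).
index = {
    "invoice_march.pdf": [0.9, 0.1, 0.0],
    "passport_scan.png": [0.0, 0.2, 0.9],
    "invoice_april.pdf": [0.8, 0.3, 0.1],
}
results = top_matches([1.0, 0.0, 0.0], index, k=2)
print([name for name, _ in results])
```

A linear scan like this is fine for a few thousand vectors; larger collections usually move to an approximate nearest-neighbor index.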

Image Processing & Background Removal

PixLab also provides classic computer vision capabilities enhanced by AI.

/bg-remove – Remove background or unwanted objects from images
/docscan – Scan and extract JSON data from over 11,000 supported ID document types from 200+ countries
/nsfw – Pixelate or block NSFW content automatically

PDF Generation & Conversion

Create and manipulate PDF documents programmatically using these SDK-free endpoints:

/pdfgen – Generate media-rich PDFs from HTML or Markdown
/pdftoimg – Convert PDF files into image previews

LLM Tool Calling Infrastructure

PixLab provides built-in tools for enhancing your LLM pipeline:

/llm-tool – Get a list of tools callable from your LLM
/tool-call – Execute a tool call based on LLM output

These endpoints enable your LLM agent to execute functions dynamically and return results within the same context.
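
The general pattern behind tool calling can be sketched locally, without touching the PixLab endpoints. The example below assumes the LLM emits a JSON object of the form {"name": ..., "arguments": {...}}; that shape, the registry, and the two toy tools are illustrative assumptions, not the /llm-tool or /tool-call contract.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical local registry; the tool names are illustrative, not PixLab tools.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that registers a function under `name`."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register


@tool("add")
def add(a: float, b: float) -> float:
    return a + b


@tool("word_count")
def word_count(text: str) -> int:
    return len(text.split())


def execute_tool_call(raw: str) -> Dict[str, Any]:
    """Run one tool call given as JSON text and return a result the LLM can read back."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc}"}
    if not isinstance(call, dict) or not isinstance(call.get("name"), str):
        return {"ok": False, "error": "expected an object with a string 'name'"}
    fn = TOOLS.get(call["name"])
    if fn is None:
        return {"ok": False, "error": f"unknown tool: {call['name']}"}
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return {"ok": False, "error": "'arguments' must be an object"}
    try:
        return {"ok": True, "name": call["name"], "result": fn(**args)}
    except TypeError as exc:  # missing or unexpected arguments
        return {"ok": False, "error": f"bad arguments: {exc}"}


print(execute_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
```

Returning errors as data rather than raising lets the agent loop feed the failure back to the model so it can retry with corrected arguments.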

System Tools & Metadata

Helpful utility endpoints for checking server health and supported formats:

/status – View current system status
/about – Get PixLab version & license info
/extension – Retrieve supported file extensions

Explore the Full API Suite

PixLab offers over 150 RESTful endpoints for vision, media, and document automation tasks.

Final Thoughts

Whether you're working on an AI productivity suite, eKYC onboarding, or document automation pipeline, PixLab’s VLM API delivers powerful, production-ready tools in minutes. All endpoints are accessible via secure HTTP requests and require no proprietary SDKs.

Get started by signing up for an API key at the PixLab Console and explore what's possible with Vision Language Models.

Build smarter apps — faster.

PixLab API 2 Released with new API Portal and Over 150 Endpoints


The development team is thrilled to announce the release of PixLab API 2, a massive upgrade to our machine vision and media processing platform, now featuring a brand-new API Portal and over 150 powerful endpoints designed for businesses, developers, and creators.

From document extraction APIs, background removal, and dynamic PDF creation to a brand-new Vision Platform backed by state-of-the-art vision-language models, PixLab API 2 offers unmatched capabilities to analyze, transform, and automate visual content at scale.

👉 Explore the new API Portal now: pixlab.io/api-portal

👉 Explore the comprehensive list of API endpoints and their documentation: pixlab.io/api-endpoints

What’s New in PixLab API 2?

✅ All-New Developer Portal

Modern UI for key management, usage monitoring, and testing API calls in real time.
Comprehensive API Reference & Endpoint List with code samples and quickstart guides for every service.

🔍 Featured API Categories & Endpoints

ID Scan & Extract API

Scan official documents effortlessly with PixLab’s ID Scan & Extract API:
- Comprehensive ID Support: Quickly scan & extract data from passports, driver's licenses, and 11,000+ document types from 197+ countries.
- Structured JSON Output: Get parsed ID fields (name, DOB, MRZ, face, etc.) in a clean, machine-readable format.

👉 Explore DOCSCAN Documentation

FACEIO – Face Recognition & Authentication

Secure and frictionless passwordless authentication:
- Facial Login: Enable seamless access with facial authentication.
- Liveness Detection: Block spoofing with advanced anti-fraud checks.
- Age Verification: Instantly validate user age for compliance.

👉 FACEIO Integration Guide

🖼️ Background Removal API

Remove image backgrounds with pixel-perfect precision:
- Ultra-Fast Processing: Background-free images in seconds.
- Automated & Scalable: Integrate into any web or mobile app.

👉 View Background Removal Docs

Try also the bulk version: BG Remove App ↗

NSFW & Content Moderation API

Keep your platform safe and compliant:
- Blur, Pixelate, or Block harmful images or frames.
- Detect adult, violent, or graphic content using advanced AI models.

👉 Content Moderation Docs

👁️‍🗨️ Vision-Language Models (VLMs)

Extract insights from documents using Vision + NLP via the Vision Platform:
- Document understanding with layout awareness
- Great for invoices, contracts, ID cards, and more

Featured new Vision API endpoints:

QUERY - Get natural language responses to image-related queries. Documented at pixlab.io/endpoints/query
DESCRIBE - Generate a natural language description of image content. Documented at pixlab.io/endpoints/describe
CHAT - An OpenAI-compatible chat API interface powered by state-of-the-art LLMs. Documented at pixlab.io/endpoints/chat

👉 And a whole lot more: Explore Vision APIs

📄 Rich PDF Generation API

Create beautiful PDFs at scale:

PDFGEN Endpoint: Convert HTML/Markdown to professional PDFs.
PDFTOIMG Endpoint: Preview and convert PDFs to image formats, as documented here.

👉 View Rich PDF Generation APIs

🛠️ Online Tools Backed by PixLab APIs

Explore our growing suite of web apps, each powered by PixLab’s infrastructure:

Convert Box – Universal file format converter (HEIC, MP4, PDF…)
Vision Workspace – AI office suite for OCR & document analysis
AI Photo Editor – Edit photos via text prompts
Annotate – Image annotation & segmentation
BG-Remove – Bulk background remover
App UI Maker – Mobile UI/UX views, interface, and code generator
Tilemap Editor – 2D level editor for games
Creative Toolbox – All-in-one visual creator's suite

For Developers, Creators, and Businesses

PixLab API version 2 is now production-ready, SDK-free, and tightly integrated, making it ideal for:

Fintechs & KYC providers
E-commerce & SaaS platforms
AI developers & researchers
Content creators & automation teams

🔑 Get Your API Key & Start Your Integration Phase

Head to the PixLab Console to:

Generate your first API key
Integrate with over 150 endpoints
Access complete documentation & live API testing tools

Join thousands of developers and businesses using PixLab to power the next generation of intelligent visual applications.

🧠 Build smarter, automate faster, and scale confidently with PixLab API 2.

— The PixLab Team

Unlocking the Power of DOCSCAN API for Developers: A Unified Solution for ID Scanning

In an era where digital identity verification is critical across industries, PixLab's DOCSCAN API stands out by providing an unparalleled ID scanning solution.

With support for over 11,000 types of identification documents from 197+ countries, DOCSCAN unifies ID Scan into a single, seamless REST API endpoint that is unmatched by other eKYC platforms. Let’s dive into the core capabilities of this powerful tool and why it’s a game-changer for developers.

DOCSCAN ID

Why DOCSCAN is a Must-Have for Developers

The DOCSCAN API Endpoint is more than just an ID scan API. Designed to be developer-friendly with RESTful API architecture, it simplifies the complex process of identifying, extracting, and validating personal data from IDs, passports, driver's licenses, visas, birth & death certificates and more.

Single Endpoint Access
With just one API endpoint, https://api.pixlab.io/docscan, developers can integrate ID scanning capabilities effortlessly into their applications. This endpoint unifies ID scanning, eliminating the need for multiple services or platforms to handle various document types.
Unmatched Document Coverage
DOCSCAN supports 11,097+ documents, including both national and international travel documents. Whether structured with Machine Readable Zones (MRZ) or not, the API efficiently processes them. No other eKYC platform on the market today provides this breadth of coverage.
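
To make the single-endpoint idea concrete, here is a minimal sketch that only builds the request URL and sends nothing. The endpoint address comes from the text above; the query parameter names `img`, `type`, and `key`, as well as the sample values, are assumptions to be checked against the DOCSCAN documentation.

```python
from urllib.parse import urlencode

DOCSCAN_URL = "https://api.pixlab.io/docscan"  # endpoint named in the article


def build_docscan_url(image_url: str, doc_type: str, api_key: str) -> str:
    """Build a GET URL for a scan request.

    The parameter names `img`, `type` and `key` are assumptions for this
    sketch; confirm them against the official DOCSCAN reference.
    """
    if not image_url or not doc_type or not api_key:
        raise ValueError("image_url, doc_type and api_key are all required")
    query = urlencode({"img": image_url, "type": doc_type, "key": api_key})
    return f"{DOCSCAN_URL}?{query}"


# Placeholder values: substitute a real image link and your own API key.
url = build_docscan_url("https://example.com/passport.png", "passport", "MY_API_KEY")
print(url)
```

Because every document type goes through the same URL, switching from passports to ID cards should only mean changing the assumed `type` value rather than wiring up a second integration.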

Key Capabilities of DOCSCAN API Endpoints

Global ID Support
DOCSCAN is designed to handle IDs from nearly every country in the world, making it ideal for global businesses. From passports and residence permits to birth certificates, the API ensures that organizations can onboard users regardless of their location.
RESTful API Architecture
The API’s REST architecture ensures easy integration across platforms and programming languages, including Python, Java, PHP, and JavaScript. With a well-documented REST endpoint, developers can quickly set up the solution without extensive learning curves.
Intelligent Image Handling
DOCSCAN integrates intelligent image correction, automatically adjusting for skew, distortion, or layout variations to enhance scanning accuracy. This ensures that even low-quality images yield accurate extraction of data.
Built-in Face Detection
For applications that require additional security, DOCSCAN offers face detection and extraction.
Privacy-First Processing
PixLab processes all data in-memory, ensuring no user data is stored on servers. This privacy-first design aligns with regulatory compliance, including GDPR, making it a trustworthy choice for sensitive applications.

Streamlined Integration with Code Samples

DOCSCAN offers ready-to-use code samples, making integration a breeze for developers of all skill levels. Whether you’re working on a financial service, e-commerce, healthcare, or travel platform, the code examples available in multiple languages help you quickly adopt DOCSCAN into your existing project.

Conclusion

PixLab’s DOCSCAN API Endpoint sets a new standard for identity verification, offering a unified, powerful, and developer-friendly platform. With comprehensive global document coverage and advanced features like face detection, it helps businesses scale their eKYC operations effortlessly.

Whether you're building solutions for financial services, healthcare, travel, or e-commerce, the DOCSCAN API endpoint offers scalable, privacy-first, and seamless ID scanning capabilities. Start leveraging DOCSCAN today to simplify identity verification across your digital platforms.


PixLab Unveils New eKYC Platform with Unmatched Global ID Document Support

PixLab, a leading provider of advanced image and video analysis solutions, is thrilled to announce the launch of its revolutionary eKYC platform, a game-changing solution for developers designed to streamline and enhance Know Your Customer (KYC) processes.

DOCSCAN API

This state-of-the-art platform introduces DOCSCAN, a powerful REST API endpoint capable of scanning and processing over 11,000 types of ID documents from more than 197 countries, both with and without Machine Readable Zones (MRZ).

At the heart of this platform is the powerful DOCSCAN API endpoint, which offers unparalleled support for ID document scanning and processing.

Why DOCSCAN is a Game Changer

The DOCSCAN API is a single REST API endpoint that can scan and process over 11,000 types of ID documents, both with and without Machine Readable Zones (MRZ), from more than 197 countries. No other KYC platform has achieved such a milestone, making DOCSCAN an essential tool for developers and system integrators.

Key Features

Comprehensive Document Support: The API supports a vast array of officially issued ID documents, including passports, ID cards, driver's licenses, visas, and birth/death certificates. Learn more about the supported documents and countries.
Single REST API Endpoint: Simplifies the integration process by providing a single endpoint for all your ID scanning and processing needs. Check out the API documentation for detailed information.
Advanced Data Extraction: Automatically extracts essential details such as full name, ID number, date of birth, and address, along with automatic face detection and cropping.
High Volume Processing: Built on a scalable architecture capable of handling millions of API calls monthly, making it ideal for large-scale deployments.

Easy Integration

We understand that seamless integration is crucial for developers. That’s why the DOCSCAN API comes with comprehensive documentation and code samples in multiple programming languages. Whether you’re working with Python, PHP, JavaScript, or Ruby, you’ll find everything you need to get started quickly. Explore our integration guide and start building today.

Real-World Applications

The versatility of the DOCSCAN API makes it suitable for various industries and use cases:

Financial Services: Streamline customer onboarding and ensure compliance with KYC/AML regulations.
Healthcare: Enhance patient identity verification processes.
E-commerce: Prevent fraud and chargebacks by verifying customer identities.
Travel & Hospitality: Simplify check-in processes and enhance security.

Customer Testimonials

Our customers have already experienced the transformative power of the DOCSCAN API:

"The versatility of the DOCSCAN RESTful API made it perfectly fit into our diverse tech stack. It's reliable and incredibly easy to use for our KYC purposes." - Priya, Software Engineer at Daisy Tech

"DOCSCAN is a game changer; it saved us a lot of development time and made the integration process smooth and straightforward." - Mrad, CTO at Symisc Systems

Get Started Today

Ready to elevate your KYC processes? Visit the new PixLab eKYC Platform and explore the DOCSCAN API documentation to get started. Obtain your first API key from the PixLab Console and start integrating today.

For more detailed information and to begin your integration, visit the following links:

PixLab eKYC Platform.
DOCSCAN API Documentation.
Get Your API Key from the PixLab Console.

Transform your KYC and ID Scan processes with PixLab’s innovative eKYC Platform and the powerful DOCSCAN API Endpoint. We can't wait to see what you build!

Press Release Document: https://pixlab.io/pixlab-docscan-ekyc/-press-release.pdf.

PixLab Document Scanner Scores Number 1 as Malaysia's Premier KYC ID Verification (MyKAD) Provider

Kuala Lumpur, Malaysia - In a significant industry recognition, PixLab's Document Scanner has been awarded the title of the top KYC ID Verification provider for MyKAD in Malaysia. This honor underscores the company's commitment to excellence and its continuous drive to provide the best identity verification solutions in the market.

PixLab Document Scanner

PixLab's Document Scanner has set itself apart with its cutting-edge machine learning technology, user-friendly interface, and stringent data security measures. Businesses across Malaysia have lauded the product for its accuracy and efficiency, making it the go-to solution for KYC ID verification.

Why PixLab Stands Out

PixLab's Document Scanner has been recognized as the number one KYC ID Verification provider for MyKAD in Malaysia. This accolade is not just a testament to the product's superior technology but also its commitment to ensuring a seamless user experience.

Advanced Technology: Leveraging state-of-the-art machine learning algorithms, PixLab offers unparalleled accuracy in scanning and verifying MyKAD documents. This ensures that businesses can trust the authenticity of the ID being presented, reducing the risk of fraud.
User-Centric Design: Understanding the importance of a smooth user journey, PixLab has designed its scanner to be intuitive. This means quicker onboarding for customers and less friction in the verification process.
Data Security: In today's world, data breaches are a growing concern. PixLab prioritizes user data security, ensuring that all scanned information is encrypted and stored securely.

Code Samples

Press Release

"We are immensely proud of this recognition," said Mrad Chams, CTO of PixLab. "It reaffirms our dedication to providing the best solutions to our users. We understand the critical role identity verification plays in today's digital landscape, and PixLab strive to offer a product that is both reliable and easy to use".

As PixLab continues to innovate, the industry and its users can expect even more advanced features and enhanced user experience in the future. With its eyes set on global expansion, PixLab is on a trajectory to redefine identity verification standards worldwide.

For more information about PixLab and its award-winning Document Scanner, please visit PixLab's website.

Press Release - PixLab Introduces Groundbreaking Document Scanning API for KYC and ID Verification

FOR IMMEDIATE RELEASE

PixLab, a leading provider of advanced image and video analysis solutions, is thrilled to unveil its newly redesigned and highly innovative Machine Learning-based document scanner engine, developed in-house.

This groundbreaking technology is specifically designed for Know Your Customer (KYC) and ID verification tasks, offering customer onboarding solutions that go beyond standard KYC and Anti-Money Laundering (AML) checks. The PixLab DOCSCAN API Endpoint empowers organizations to boost conversions, reduce fraud, and maintain global compliance effortlessly.

The PixLab DOCSCAN API Endpoint revolutionizes the way government-issued documents are scanned and verified, including passports, visas, U.S. driver's licenses, and ID cards from various countries such as Malaysia, Singapore, India, and the United Arab Emirates. With a single call, organizations can effortlessly scan and extract critical information from these documents. The API endpoint also features automatic face extraction, enhancing the accuracy and completeness of the scanning process.

Below is the DOCSCAN API endpoint output for a typical input passport image:

Input Passport Specimen (JPEG/PNG/BMP Image or PDF Upload)

And the extracted passport (MRZ) fields output

One of the key highlights of the DOCSCAN API endpoint is its ability to transform binary data, such as the Passport Machine Readable Zone (MRZ), into a stream of textual content in JSON format. This includes extracting crucial details such as the full name, issuing country, document number, date of birth, and date of expiry. This seamless integration of the extracted information into your application allows for streamlined and efficient processes, reducing manual effort and eliminating errors.

PixLab takes document scanning and verification to the next level by offering additional features that help identify possible fraudulent documents. The DOCSCAN API endpoint's automated face scanning capabilities, combined with its MRZ extraction functionality, enable developers to automate passport scanning while maintaining stringent security standards. This empowers organizations to protect against fraudulent activities and maintain the integrity of their processes.

Input ID Card Specimen from Malaysia (MyKAD) (JPEG/PNG/BMP Image or PDF Upload)

MyKAD Specimen

And the extracted MyKAD fields, including face extraction, date of birth, full name, address, religion, and ID number

Extracted MyKAD Fields

In today's increasingly digitized era, the need for automation and efficiency is paramount. Manual and repetitive administrative tasks can be time-consuming, error-prone, and costly. By leveraging the power of the PixLab DOCSCAN API endpoint, organizations can automate passport processing, resulting in substantial cost savings, accelerated customer onboarding, and enhanced accuracy in administrative processes.

Finally, to learn more about PixLab's DOCSCAN API endpoint and its comprehensive features, please refer to the previous blog posts & code samples:

Code Samples

Scan a government issued Passport document using the PixLab API. Extract the user's face and display all MRZ fields (PHP Code)
Scan a government issued Passport document using the PixLab API. Extract the user's face and display all MRZ fields (Python Code)
Scan Malaysia ID Card (MyKAD) (PHP Code)
Scan a government issued ID card from Malaysia (MyKAD, MyKID), extract the user's face and parse all fields (Python Code)

About PixLab

PixLab is a leading provider of advanced image and video analysis solutions, leveraging state-of-the-art technologies such as Machine Learning and Artificial Intelligence. With a commitment to innovation, PixLab empowers organizations across industries to automate and streamline their image and video processing workflows. The company's robust APIs and developer-friendly tools enable businesses to extract valuable insights, perform accurate face recognition, analyze emotions, detect objects, and much more.

PixLab’s Document Scanner Is Now Able to Scan Driver’s Licenses Issued by Any U.S. State

The PixLab Optical Character Recognition team is thrilled to announce that its document scanning API endpoint, /DOCSCAN, is now able to scan U.S. driver’s licenses and driving permits issued by all 50 U.S. states.

DOCSCAN API endpoint now supports scanning U.S. driver’s licenses from all 50 states

The /DOCSCAN API endpoint now allows any website that is presented with a U.S. driver’s license, international passport, or ID card to verify that the information entered by the end user matches what appears on the submitted or uploaded ID document image.
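
The verification step described here, comparing what the user typed with what the scan returned, can be sketched as follows. The field names (`full_name`, `date_of_birth`) and sample values are hypothetical; the real keys depend on the JSON returned by /DOCSCAN for the document type in question.

```python
import unicodedata
from typing import Dict, List


def normalize(value: str) -> str:
    """Casefold, strip accents, and collapse whitespace so cosmetic differences are ignored."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.casefold().split())


def mismatched_fields(entered: Dict[str, str], extracted: Dict[str, str]) -> List[str]:
    """Names of user-entered fields that are missing from, or differ from, the scanned data."""
    return [
        name
        for name, value in entered.items()
        if name not in extracted or normalize(value) != normalize(extracted[name])
    ]


# Hypothetical form input and hypothetical scan output.
entered = {"full_name": "  josé  Alvarez", "date_of_birth": "1990-04-12"}
extracted = {"full_name": "JOSE ALVAREZ", "date_of_birth": "1990-04-21"}
print(mismatched_fields(entered, extracted))  # → ['date_of_birth']
```

Exact matching after normalization is deliberately strict; a production flow may prefer to route near-misses to manual review instead of rejecting them outright.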

Usage & Code Samples

Given an input U.S. driver’s license image issued by any of the 50 U.S. states, crop the license holder’s face and extract the fields of interest as follows:

Input U.S. driver’s license image (Car Vectors by Vecteezy)

Extracted Fields: showcase of the fields extracted from the submitted driver's license image

The extracted fields after a successful call to the /DOCSCAN API endpoint are:

License holder cropped face. This image will be stored on an AWS S3 bucket of your choice if you connect your target bucket from the PixLab Console.
Issuing Country (USA obviously).
Issuing State Name.
Issuing State Two-Letter Code.
License Number.
License Holder’s Full Name.
License Holder’s Address.
License Holder’s Date of Birth (yyyy-mm-dd).
License Issuing Date (yyyy-mm-dd).
License Expiry Date (yyyy-mm-dd).
License Holder’s Gender.
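
Since the dates in the list above come back in yyyy-mm-dd form, they can be parsed with the standard library directly. The sketch below shows two typical follow-up checks, expiry and holder age. Treating a license as still valid on its expiry date is an assumption of this example, not a documented PixLab rule.

```python
from datetime import date
from typing import Optional


def is_expired(expiry: str, today: Optional[date] = None) -> bool:
    """True if the yyyy-mm-dd expiry date lies strictly before `today`.

    Assumption: a document is still considered valid on its expiry date.
    """
    today = today or date.today()
    return date.fromisoformat(expiry) < today


def age_on(date_of_birth: str, on: date) -> int:
    """Age in whole years on the given day, from a yyyy-mm-dd birth date."""
    born = date.fromisoformat(date_of_birth)
    before_birthday = (on.month, on.day) < (born.month, born.day)
    return on.year - born.year - (1 if before_birthday else 0)


# Sample values invented for illustration.
print(is_expired("2024-01-31", today=date(2024, 2, 1)))  # → True
print(age_on("2000-06-15", date(2024, 6, 14)))           # → 23
```

`date.fromisoformat` raises `ValueError` on malformed input, which is a convenient signal that a field was misread and the scan should be retried or reviewed.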

The code samples used to achieve such results are available via the following Gists:

Python Gist. Also available on the PixLab GitHub Repository.
PHP Gist. Also available on the PixLab GitHub Repository.

Algorithms Under the hood

Face extraction is automatically performed using the /FACEDETECT API endpoint.
/DOCSCAN already supports GET & POST HTTP methods so you can upload your document images directly from your application without relying on a foreign server. Refer to this Gist on how to do so.
Once the image is processed on our server, it is automatically deleted. We do not keep any trace or log of your input images.
Internally, we mainly rely on PP-OCR which is a practical ultra-lightweight OCR system that is mainly composed of three parts: Text Detection, Bounding Box Isolation, & Text Recognition. This combination produces highly accurate results in less than 5 seconds of processing.

Full Scan Support for United Arab Emirates (UAE) ID/Residence Cards

The PixLab Document Scanner development team is pleased to announce that it now fully supports scanning Emirates (UAE) ID & Residence Cards via the /DOCSCAN API endpoint in real time, using your favorite programming language.

When invoked, the /DOCSCAN HTTP API endpoint extracts (crops) any detected face and transforms the raw UAE ID/Residence Card content, such as holder name, nationality, and ID number, into a JSON object ready to be consumed by your app.

Below is a typical output of the /DOCSCAN API endpoint for an Emirati (UAE) ID card input sample:

Input Emirates (UAE) ID Card

UAE ID card specimen

Extracted UAE ID Card Fields

UAE extracted fields

The code samples used to achieve such results are available via the following gists:

Python Code Samples for Scanning UAE ID Card: uae_emirates_id_card_scan.py
PHP Code Samples for Scanning UAE ID Card: uae_emirates_id_card_scan.php
PixLab Github Repository: github.com/symisc/pixlab

The same logic applies to scanning official travel documents like visas, passports, and ID cards from many other countries in a unified manner, regardless of the programming language used on your backend (Python, PHP, Ruby, JS, etc.), thanks to the DOCSCAN API endpoint, as shown in previous blog posts:

Passports & Travel Document Scan: Blog Announcement & Code Sample.
Malaysia & Singapore ID Card Scan: Blog Announcement & Code Sample.
Aadhar India ID Card Scan: Blog Announcement & Code Sample.

Algorithm Details

Internally, PixLab's document scanner engine is based on PP-OCR, a practical ultra-lightweight OCR system composed mainly of three parts: DB text detection, detection frame correction, and CRNN text recognition. DB stands for Differentiable Binarization, the detector introduced in the paper "Real-time Scene Text Detection with Differentiable Binarization".

PP-OCR: A Practical Ultra Lightweight OCR System - Algorithm Overview


The system adopts 19 effective strategies from 8 aspects including backbone network selection and adjustment, prediction head design, data augmentation, learning rate transformation strategy, regularization parameter selection, pre-training model use, and automatic model tailoring and quantization to optimize and slim down the models of each module.

In PP-OCR, Differentiable Binarization (DB) is used as the text detector, which is based on a simple segmentation network. CRNN is used as the text recognizer: it integrates feature extraction and sequence modeling, and adopts the Connectionist Temporal Classification (CTC) loss to avoid the inconsistency between prediction and label.
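
To give a feel for why CTC removes the need to align each image column with a character, here is a minimal sketch of greedy CTC decoding, the inference-side counterpart of the CTC loss. It is a generic textbook illustration, not PP-OCR's or PixLab's code; reserving class index 0 for the blank symbol and the two-letter alphabet are assumptions of the example.

```python
from itertools import groupby
from typing import List, Sequence

BLANK = 0  # class index reserved for the CTC blank symbol (assumption of this sketch)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: best class per frame, merge repeats, then drop blanks.

    Class i (for i > 0) maps to alphabet[i - 1].
    """
    # 1. Pick the highest-scoring class for every time step.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # 2. Collapse runs of the same class into a single occurrence.
    collapsed = [idx for idx, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal classes keeps both characters.
    return "".join(alphabet[idx - 1] for idx in collapsed if idx != BLANK)


def one_hot(index: int, size: int = 3) -> List[float]:
    return [1.0 if i == index else 0.0 for i in range(size)]


# Seven frames whose best classes are 1, 1, 0, 1, 2, 2, 0 (0 = blank, 1 = "A", 2 = "B").
frames = [one_hot(i) for i in (1, 1, 0, 1, 2, 2, 0)]
decoded = ctc_greedy_decode(frames, "AB")
print(decoded)  # → AAB
```

The blank between the second and fourth frames is what allows the doubled "A" to survive the repeat-merging step.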

The algorithm is further optimized in five aspects. The detection model adopts the CML (Collaborative Mutual Learning) knowledge distillation strategy and the CopyPaste data expansion strategy. The recognition model adopts the LCNet lightweight backbone network, the U-DML knowledge distillation strategy, and an enhanced CTC loss function, which further improves the inference speed and prediction quality.

Modern Passport Structure & Bulk Scan APIs

A passport is a document that almost everyone has at some point in their lives. It is issued by a country’s government to its citizens and is mainly used for travel. It also serves as proof of nationality and name and, more importantly, as a universally unique ID for its owner.

Modern Passport Structure

Passport Specimen

Many services have long accepted passports as identification documents from their customers to complete their KYC (Know Your Customer) forms, as required by the legislation in force. This is especially true for the finance, HR, and travel sectors. In most cases, a human operator will verify the authenticity of the submitted document and either validate or reject it.

Things can get really complicated if you have hundreds of KYC forms to check, and even more so if your clients differ in nationality. You will quickly find yourself drowning in physical copies of passports in languages you cannot even understand, not to mention the potential legal problems of having passport copies lying around the office. This is why an automated and safe solution for passport processing is required!

Modern Passport Structure

From the 1980s onwards, most countries started issuing passports containing an MRZ. MRZ stands for Machine Readable Zone, and it is usually located at the bottom of the passport, as shown below:

Modern Passport Specimen

Passports MRZ Sample

Passports that contain an MRZ are referred to as MRPs, or machine-readable passports (almost all modern passports have one). The structure of the MRZ is standardized by ICAO Document 9303 and by the International Electrotechnical Commission as ISO/IEC 7501-1.

The MRZ is an area on the document that can easily be read by a machine using an OCR Reader Application or API. It’s not important for you to understand how it works, but if you look at it carefully, you will see that it contains most of the relevant information on the document, combined with additional characters and a checksum that can be extracted programmatically and automatically via API as we will see in the next section.

Once parsed, the following information is automatically extracted from the target MRZ and made immediately available to your app, thanks to the /docscan API endpoint:

issuingCountry: The issuing country or organization, encoded in three characters.
fullName: Passport holder full name. The name is entirely upper case.
documentNumber: This is the passport number, as assigned by the issuing country. Each country is free to assign numbers using any system it likes.
checkDigit: Check digits are calculated based on the previous field. Thus, the first check digit is based on the passport number, the next is based on the date of birth, the next on the expiration date, and the next on the personal number. The check digit is calculated using the algorithm defined in ICAO Document 9303.
nationality: The nationality of the passport holder, encoded in three characters.
dateOfBirth: The date of the passport holder's birth in YYMMDD form. Year is truncated to the least significant two digits. Single digit months or days are prepended with 0.
sex: Sex of the passport holder, M for males, F for females, and < for non-specified.
dateOfExpiry: The date the passport expires in YYMMDD form. Year is truncated to the least significant two digits. Single digit months or days are prepended with 0.
personalNumber: This field is optional and can be used for any purpose that the issuing country desires.
finalcheckDigit: This is a check digit for positions 1 to 10, 14 to 20, and 22 to 43 on the second line of the MRZ. Thus, the nationality and sex are not included in the check. The check digit is calculated using the same ICAO Document 9303 algorithm.
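The check digits listed above can be recomputed locally. In the ICAO 9303 scheme, digits keep their value, the letters A to Z count as 10 to 35, the filler `<` counts as 0, each character is multiplied by the repeating weights 7, 3, 1, and the sum is taken modulo 10. The Python sketch below applies this to the second MRZ line of a passport. The function names and the keys of the returned dictionary are hypothetical, chosen to echo the field names above, and the sample line is the ICAO specimen.

```python
WEIGHTS = (7, 3, 1)  # repeating weights used by the ICAO 9303 check digit


def char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(field: str) -> int:
    """Weighted sum of the character values, modulo 10."""
    total = sum(char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field))
    return total % 10


def verify_line2(line2: str) -> dict:
    """Recompute every check digit on the second line of a passport MRZ."""
    if len(line2) != 44:
        raise ValueError("the second MRZ line of a passport has 44 characters")

    def ok(field: str, digit: str) -> bool:
        return digit.isdigit() and check_digit(field) == int(digit)

    personal = line2[28:42]
    # An unused personal number may carry "<" instead of a digit.
    personal_ok = ok(personal, line2[42]) or (
        personal.strip("<") == "" and line2[42] == "<"
    )
    # Positions 1-10, 14-20 and 22-43 (1-based) feed the final check digit.
    composite = line2[0:10] + line2[13:20] + line2[21:43]
    return {
        "documentNumber": ok(line2[0:9], line2[9]),
        "dateOfBirth": ok(line2[13:19], line2[19]),
        "dateOfExpiry": ok(line2[21:27], line2[27]),
        "personalNumber": personal_ok,
        "final": ok(composite, line2[43]),
    }


SPECIMEN_LINE2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
print(check_digit("L898902C3"))      # prints: 6
print(verify_line2(SPECIMEN_LINE2))  # every entry is True for the specimen
```

A failed check usually points to an OCR misread or a tampered field, which is why a local re-check of this kind is a cheap first filter before any manual review.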

Automatic Passport Processing

Fortunately for developers wishing to automate passport scanning, PixLab can automatically scan and extract the passport MRZ, and can also help detect possibly fraudulent documents. This is made possible by the /docscan API endpoint, which lets you scan government-issued documents such as passports, visas, or ID cards from various countries in a single call.

Besides extracting the MRZ, the /docscan API endpoint automatically crops any detected face and transforms the Machine Readable Zone found in the image into a stream of text content (i.e. full name, issuing country, document number, date of expiry, etc.) ready to be consumed by your app in JSON format.

Below is a typical output of the /docscan endpoint for a passport input image:

Input Passport Specimen (JPEG/PNG/BMP Image)

Input Image URL

Extracted MRZ Fields

MRZ Fields

What follows is the gist used to achieve this result:
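As a rough idea of what such a call can look like, here is a minimal Python sketch using only the standard library. Several details are assumptions rather than confirmed facts and should be checked against the official /docscan documentation: the base URL `https://api.pixlab.io/docscan`, the query parameters `img`, `type`, and `key`, and the `status`, `error`, and `fields` keys in the JSON reply. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the endpoint; verify it in the official documentation.
DOCSCAN_URL = "https://api.pixlab.io/docscan"


def build_docscan_url(img_url: str, api_key: str, doc_type: str = "passport") -> str:
    """Build the request URL. Parameter names (img, type, key) are assumptions."""
    query = urllib.parse.urlencode({"img": img_url, "type": doc_type, "key": api_key})
    return f"{DOCSCAN_URL}?{query}"


def scan_passport(img_url: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON reply."""
    url = build_docscan_url(img_url, api_key)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def extract_fields(reply: dict) -> dict:
    """Return the MRZ fields, assuming a 'status' of 200 signals success."""
    if reply.get("status") != 200:
        raise RuntimeError(reply.get("error", "docscan request failed"))
    return reply.get("fields", {})
```

With a valid API key, `extract_fields(scan_passport("https://example.com/passport.png", "YOUR_PIXLAB_KEY"))` would be expected to return a dictionary holding entries such as `fullName`, `documentNumber`, and `dateOfExpiry`, as listed earlier. Both the image URL and the key shown here are placeholders.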

Other document scanning code samples are available via the following GitHub links:

Python code for scanning Passports: passport_scan.py.
PHP code for scanning Passports: passport_scan.php.

Face extraction is automatically performed using the /facedetect API endpoint. For a general-purpose Optical Character Recognition engine, you should rely on the /OCR API endpoint instead. If you are dealing with PDF documents, you can first convert them to raw images via the /pdftoimg endpoint.

Conclusion

The era we are in is more digitized than ever. Repetitive tasks are slowly being taken over by computers and robots, which in many cases can perform them faster, with fewer mistakes, and more cost-effectively. At PixLab, we focus on building software to replace manual, repetitive labor in administrative business processes. Processing and checking passports can be very time-consuming. Using /docscan to automate your passport processing will enable you to cut costs, onboard customers faster, and reduce errors in administrative processes.
