Talkie OCR - Image to Speech Now on the App Store

Developed by our colleague Mrad Chams from Symisc Systems and entirely powered by the PixLab OCR API endpoint.

Talkie OCR - Image to Speech

Talkie OCR is a state-of-the-art OCR scanner that turns almost any image containing human-readable characters into text, which is in turn converted into human voice in your native language & accent. Built-in features include:

  • Automatically Recognizes the Input Language & Speaks your Accent: Once the scanned image (book page, magazine, journal, scientific paper, etc.) is recognized & transformed into text content, you'll be able to play back that text in your local accent & in over 45 languages of your choice!
  • State of the art OCR processing algorithm powered by PixLab.
  • Speaks over 45 languages with their accents.
  • Built-in translation service to over 30 foreign languages of your choice.
  • Built-in Vision Impaired Mode with the ability to recognize the input language automatically.
  • Pause & Resume Playback on Request.
  • Offline Saving for Later Reading & Playback.

Download on the App Store Get it on Google Play

Introducing the PDF to Image API Endpoint

The PixLab team is pleased to introduce the PDF to Image API endpoint, which lets you convert any PDF file to a high-resolution JPEG or PNG image.

Documentation for the /pdftoimg endpoint is available online, and a working Python code sample follows:

import requests

# Convert a PDF document to a JPEG/PNG image via the /pdftoimg endpoint.

req = requests.get('https://api.pixlab.io/pdftoimg', params={
    'src': 'https://www.getharvest.com/downloads/Invoice_Template.pdf',
    'export': 'jpeg',
    'key': 'My_PixLab_Key'  # Replace with your own PixLab API key
})
reply = req.json()
if reply['status'] != 200:
    print(reply['error'])
else:
    print("Link to the image output (Converted PDF page): " + reply['link'])

You can visit the PixLab GitHub repository for additional code samples in various programming languages.

Generate MEMEs Programmatically

MEMEs are a de facto internet standard nowadays. Dozens, if not hundreds, of the daily top posts on Imgur or Reddit are probably MEMEs, that is, a pop culture image with sarcastic text (always) displayed at the top, bottom or center of that image. Plenty of web tools let you create memes graphically, but only a few actually offer an API for generating memes from your favorite programming language.

In this blog post, we'll generate a few MEMEs programmatically using Python, PHP or any language that supports HTTP requests, with the help of the PixLab API. Before that, let's dive a little into the tools needed to build a MEME generator.

Crafting a MEME API

Building a RESTful API capable of generating memes on request is not that difficult. The most important part is to find a good image processing library that supports the annotate operation (i.e. text drawing). The most capable open source libraries are the ImageMagick suite and its popular fork GraphicsMagick. Both provide advanced annotate & draw capabilities such as selecting the target font, its size, the text position, the stroke width & height and beyond. Both should be a good fit and up to the task, and good tutorials exist for each if you want to build your own RESTful API.
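For readers who go the self-hosted route, here is a minimal sketch of how the annotate options mentioned above map onto ImageMagick's command-line settings (-font, -pointsize, -fill, -stroke, -strokewidth, -gravity and -annotate). The helper name build_annotate_command, the file names, the Impact font and the 10 px offset are assumptions made for illustration. The binary is magick on ImageMagick 7 and usually convert on ImageMagick 6. The snippet only builds and prints the command; it does not run it.

```python
import shlex


def build_annotate_command(src, dst, top="", bottom="", binary="magick",
                           font="Impact", pointsize=40):
    """Build an ImageMagick argv list that draws meme text on src and writes dst.

    Hypothetical helper: the font must be installed on the machine that
    eventually runs the command.
    """
    cmd = [binary, src,
           "-font", font, "-pointsize", str(pointsize),
           "-fill", "white", "-stroke", "black", "-strokewidth", "2"]
    # -gravity selects the edge the text is anchored to; -annotate draws the
    # text with a 10 px vertical offset from that edge.
    for gravity, text in (("north", top), ("south", bottom)):
        if text:
            cmd += ["-gravity", gravity, "-annotate", "+0+10", text.upper()]
    cmd.append(dst)
    return cmd


cmd = build_annotate_command("cat.jpg", "meme.jpg",
                             top="someone bumps the table",
                             bottom="right before you win")
print(shlex.join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would execute it on a machine with ImageMagick installed. Wrapping that call in an HTTP handler is the remaining step toward a self-hosted MEME API.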

In our case, we'll stick with the PixLab API because it ships with robust API endpoints such as image compositing, facial landmarks extraction and dynamic image creation, which prove to be of great help when working with complex stuff such as cloning Snapchat filters or playing with GIFs. So, without further ado, let's start programming some memes.

First MEME

Given an input image, the famous Cool Cat public domain photo:

Cool CAT face

Draw some funny text on top & bottom of that image to obtain something like this:

CAT Draw

Using the following code:

import requests

# Draw some funny text on the top & bottom of the famous Cool Cat, public domain image.
# https://pixlab.io/cmd?id=drawtext is the target command
req = requests.get('https://api.pixlab.io/drawtext', params={
    'img': 'https://pixlab.io/images/jdr.jpg',
    'top': 'someone bumps the table',
    'bottom': 'right before you win',
    'cap': True,  # Capitalize the text
    'strokecolor': 'black',
    'key': 'Pix_Key',  # Replace with your own PixLab API key
})
reply = req.json()
if reply['status'] != 200:
    print(reply['error'])
else:
    print("Meme: " + reply['link'])

make_meme.py/php snippet available on the PixLab GitHub Repository.

If this is the first time you've seen the PixLab API in action, you are invited to take a look at the excellent introduction to the API in 5 minutes or less. Only one command (API endpoint) is actually needed to generate such a meme:

drawtext is the API endpoint used for text annotation. It expects the text to be displayed at the top, center or bottom of the target image and supports a bunch of other options, such as selecting the text font, its size & colors, whether to capitalize the text or not, the stroke width & opacity and so on. All the options the drawtext command takes are listed at https://pixlab.io/cmd?id=drawtext.
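As a small sketch of the three text positions, the helper below assembles a drawtext parameter dictionary and validates the reply. The parameter names (img, key, top, center, bottom, cap, strokecolor) and the status/error/link reply fields are the ones used in this post's snippets. The function names build_drawtext_params and link_or_raise are hypothetical conveniences, not part of the PixLab API.

```python
DRAWTEXT_URL = "https://api.pixlab.io/drawtext"


def build_drawtext_params(img, key, top=None, center=None, bottom=None, **options):
    """Assemble the query parameters for a drawtext call (hypothetical helper)."""
    params = {"img": img, "key": key}
    positions = {"top": top, "center": center, "bottom": bottom}
    # Keep only the positions that actually carry some text.
    texts = {name: text for name, text in positions.items() if text}
    if not texts:
        raise ValueError("provide at least one of top, center or bottom")
    params.update(texts)
    # Extra options such as cap=True or strokecolor='black' pass through as-is.
    params.update(options)
    return params


def link_or_raise(reply):
    """Return the output image link, or raise if the API reported an error."""
    if reply.get("status") != 200:
        raise RuntimeError(reply.get("error", "unknown error"))
    return reply["link"]


params = build_drawtext_params(
    "https://pixlab.io/images/jdr.jpg", "Pix_Key",
    top="someone bumps the table", bottom="right before you win",
    cap=True, strokecolor="black",
)
print(params)
```

With the requests package installed, link_or_raise(requests.get(DRAWTEXT_URL, params=params).json()) would perform the actual call.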

There is a more flexible command named drawtextat that lets you draw text on any desired region of the input image by specifying the target coordinates (X, Y) where the text should be displayed. Here is a usage example.
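Independently of the official sample, the following rough sketch shows what building a drawtextat request might look like. Everything specific here is an assumption: the endpoint URL is inferred from the pattern of the other endpoints in this post, and the parameter names text, x and y are guesses that should be checked against the drawtextat documentation before use. The snippet only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# Assumed URL, following the https://api.pixlab.io/<command> pattern used above.
DRAWTEXTAT_URL = "https://api.pixlab.io/drawtextat"


def drawtextat_url(img, key, text, x, y):
    """Build a GET URL that asks for `text` to be drawn at pixel (x, y).

    Hypothetical helper: 'text', 'x' and 'y' are assumed parameter names.
    """
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    query = urlencode({"img": img, "key": key, "text": text, "x": x, "y": y})
    return DRAWTEXTAT_URL + "?" + query


print(drawtextat_url("https://pixlab.io/images/jdr.jpg", "Pix_Key",
                     "hello world", 10, 20))
```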

Dynamic MEME

This example is similar to the previous one, except that the image we draw on is generated dynamically. That is, we ask the PixLab API server to create a new image for us with a specified height, width, background color and output format, and then we draw our text at the center of the generated image to obtain something like this:

dynamic image

Using this code:

import requests

# Dynamically create a 300x300 PNG image with a yellow background, then draw some text at its center.
# Refer to https://pixlab.io/cmd?id=newimage and https://pixlab.io/cmd?id=drawtext for additional information.

req = requests.get('https://api.pixlab.io/newimage', params={
    'key': 'My_Pix_Key',  # Replace with your own PixLab API key
    'width': 300,
    'height': 300,
    'color': 'yellow'
})
reply = req.json()
if reply['status'] != 200:
    print(reply['error'])
    raise SystemExit(1)  # Stop here: there is no image to draw on
# Link to the new image
img = reply['link']

# Draw some text now on the new image
req = requests.get('https://api.pixlab.io/drawtext', params={
    'img': img,  # The newly created image
    'key': 'My_Pix_Key',
    'cap': True,  # Uppercase
    'color': 'black',  # Text color
    'font': 'wolf',
    'center': 'bonjour'
})
reply = req.json()
if reply['status'] != 200:
    print(reply['error'])
else:
    print("Pic location: " + reply['link'])

dynamic_meme.py/php snippet available on the PixLab GitHub Repository.

Here, we request a new image using the newimage API endpoint, which exports to PNG by default, although you can change the output format on request. We set the image width and height to 300x300 and the background color to yellow.

Note that if either the height or the width parameter is missing (but not both), the available length is applied to the missing side. If you want a transparent image, set the color parameter to none.
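The fallback rule can be illustrated locally. The sketch below mirrors the behaviour described above: a missing side takes the other side's length, and color set to none requests a transparent image. It does so on the client side, purely for illustration; the real resolution happens on the PixLab server. The function names effective_size and newimage_params are hypothetical.

```python
def effective_size(width=None, height=None):
    """Return the (width, height) the generated image is expected to have."""
    if width is None and height is None:
        raise ValueError("at least one of width or height is required")
    # A missing side falls back to the length of the side that was given.
    return (width if width is not None else height,
            height if height is not None else width)


def newimage_params(key, width=None, height=None, color=None):
    """Assemble newimage parameters, leaving out whatever was not specified."""
    effective_size(width, height)  # Validates that at least one side is set
    params = {"key": key}
    if width is not None:
        params["width"] = width
    if height is not None:
        params["height"] = height
    if color is not None:
        params["color"] = color  # Use "none" for a transparent background
    return params


print(effective_size(width=300))
print(newimage_params("My_Pix_Key", width=300, color="none"))
```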

We finally draw our text at the center of the newly created image using the wolf font, black color and 35 px font size. Of course, one could also draw lines or a rectangle (for example to surround faces), merge with other images and so forth.

Mimic Snapchat Filters

This last example, although only loosely related to our subject, shows how to mimic the famous Snapchat filters programmatically. So, given an input image:

plain woman face

this eye mask, located at pixlab.xyz/images/eye_mask.png:

eye_mask

plus this mustache, located at pixlab.xyz/images/mustache.png:

mustache

output something like this:

snapchat filter

To achieve that effect (leaving aside the MEME text drawn at the bottom of the image), lots of computer vision algorithms are involved, such as face detection, facial landmarks extraction, pose estimation and so on. You are invited to take a look at our previous blog post on how such a filter is produced and which techniques are involved: Mimic Snapchat Filters Programmatically.

Conclusion

Generating MEMEs is quite easy provided you have a good image manipulation library. We saw that ImageMagick and GraphicsMagick, with their PHP/Node.js bindings, can be used to create your own MEME RESTful API. Our simple yet elegant solution is to rely on the PixLab API. Not only is generating MEMEs straightforward, but you'll also be able to perform advanced analysis & processing operations on your input media, such as face analysis, NSFW content detection and so forth. You are invited to take a look at the GitHub sample page for dozens of other interesting samples in action, such as censoring images based on their NSFW score, blurring human faces, making GIFs, etc. All of them are documented in the PixLab API endpoints reference doc and the 5 minutes intro to the API. Finally, if you have any suggestions or criticism, please leave a comment below.