Tips for Improving OCR Results

Cyril edited this page Jan 6, 2015 · 6 revisions

Important to Know

Tesseract is a library for optical character recognition (OCR). It performs best when it is given a preprocessed image, ideally crisp black text on a pure white background.

The following sections provide some tips on preprocessing images before running them through Tesseract, to improve both the accuracy and the speed of OCR.

Upstream tips

The upstream Tesseract library has a Wiki page on how to improve the quality of OCR results here: https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality

It's worth reading because it explains the kinds of processing Tesseract does and does not do, which is useful in determining what preprocessing to perform on an image.

Using GPUImage's adaptive threshold

GPUImage is a fantastic image processing library for iOS that filters images on the GPU, so it's really fast. It even comes with a photo camera and a live video camera that you can use to create a pipeline of one or more filters.

You can preprocess an image for OCR with GPUImage's GPUImageAdaptiveThresholdFilter. According to the GPUImage documentation, this filter "determines the local luminance around a pixel, then turns the pixel black if it is below that local luminance and white if above. This can be useful for picking out text under varying lighting conditions."

Here's some sample code to get you started:

// Grab the image you want to preprocess
UIImage *inputImage = [UIImage imageNamed:@"my_test_image.jpg"];

// Initialize our adaptive threshold filter
GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
stillImageFilter.blurRadiusInPixels = 4.0; // Blur radius in pixels, defaults to 4.0; adjust it to tune the threshold

// Retrieve the filtered image from the filter
UIImage *filteredImage = [stillImageFilter imageByFilteringImage:inputImage];

// Send the thresholded image to Tesseract (`tesseract` is assumed to be your already-initialized Tesseract instance)
tesseract.image = filteredImage;
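
To make the quoted description of adaptive thresholding more concrete, here is a minimal CPU sketch of the same idea in plain C (which also compiles inside an Objective-C file). It is not GPUImage's implementation, which runs on the GPU and may differ in details. The function name `adaptive_threshold`, the 8-bit grayscale buffer layout, and the square averaging window are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GPUImage or Tesseract.
 * src and dst are width*height 8-bit grayscale buffers in row-major order.
 * Each pixel is compared with the mean luminance of the square window of
 * the given radius around it (clipped at the image borders):
 * below the local mean -> black (0), otherwise -> white (255). */
void adaptive_threshold(const uint8_t *src, uint8_t *dst,
                        int width, int height, int radius)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* Clip the window to the image bounds. */
            int y0 = (y - radius < 0) ? 0 : y - radius;
            int y1 = (y + radius >= height) ? height - 1 : y + radius;
            int x0 = (x - radius < 0) ? 0 : x - radius;
            int x1 = (x + radius >= width) ? width - 1 : x + radius;

            /* Sum the luminance over the window. */
            long sum = 0;
            long count = 0;
            for (int ny = y0; ny <= y1; ny++) {
                for (int nx = x0; nx <= x1; nx++) {
                    sum += src[(size_t)ny * width + nx];
                    count++;
                }
            }

            /* pixel < mean is equivalent to pixel * count < sum,
             * which avoids an integer division. */
            long pixel = src[(size_t)y * width + x];
            dst[(size_t)y * width + x] = (pixel * count < sum) ? 0 : 255;
        }
    }
}
```

Because each pixel is compared with its own neighborhood rather than with one global cutoff, text in a shadowed corner of a photo can still come out black on white. The radius here plays a role similar to the filter's blurRadiusInPixels: a larger window averages over more of the surrounding area.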