Tips for Improving OCR Results
Tesseract is a library for performing optical character recognition (OCR). It performs best when given a preprocessed image, ideally crisp black text on a pure white background.
The following sections provide some tips on how to preprocess images before running them through Tesseract to improve both the accuracy and the speed of OCR.
The upstream Tesseract library has a Wiki page on how to improve the quality of OCR results here: https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality
It's worth reading because it explains the kinds of processing Tesseract does and does not do, which is useful in determining what preprocessing to perform on an image.
GPUImage is a fantastic image processing library for iOS that filters images on the GPU, so it's really fast. It even comes with a photo camera and a live video camera that you can use to create a pipeline of one or more filters.
You can use GPUImage's GPUImageAdaptiveThresholdFilter to preprocess an image for OCR. The filter "determines the local luminance around a pixel, then turns the pixel black if it is below that local luminance and white if above. This can be useful for picking out text under varying lighting conditions."
Here's some sample code to get you started:
// Grab the image you want to preprocess
UIImage *inputImage = [UIImage imageNamed:@"my_test_image.jpg"];
// Initialize our adaptive threshold filter
GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
stillImageFilter.blurRadiusInPixels = 4.0; // Adjust this to tweak the blur radius of the filter; defaults to 4.0
// Retrieve the filtered image from the filter
UIImage *filteredImage = [stillImageFilter imageByFilteringImage:inputImage];
// Send the thresholded image to Tesseract (assumes `tesseract` is an already initialized instance)
tesseract.image = filteredImage;
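To make the quoted description of adaptive thresholding more concrete, here is a minimal CPU sketch in plain C (which also compiles as Objective-C). It is not GPUImage's implementation: the function name `adaptive_threshold` and the `radius` parameter are hypothetical, and a simple box average stands in for the blur that the GPU filter uses to estimate local luminance. It assumes an 8-bit grayscale image stored row by row.

```c
#include <assert.h>
#include <stdint.h>

/* Naive adaptive threshold (illustrative only, not GPUImage's code).
 * Each pixel is compared with the mean of the square window around it:
 * darker than the local mean becomes black (0), otherwise white (255). */
void adaptive_threshold(const uint8_t *src, uint8_t *dst,
                        int width, int height, int radius)
{
    assert(src && dst && width > 0 && height > 0 && radius >= 0);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* Clamp the window to the image bounds. */
            int x0 = (x - radius < 0) ? 0 : x - radius;
            int x1 = (x + radius >= width) ? width - 1 : x + radius;
            int y0 = (y - radius < 0) ? 0 : y - radius;
            int y1 = (y + radius >= height) ? height - 1 : y + radius;

            /* Sum the luminance values inside the window. */
            long sum = 0;
            long count = 0;
            for (int yy = y0; yy <= y1; yy++) {
                for (int xx = x0; xx <= x1; xx++) {
                    sum += src[yy * width + xx];
                    count++;
                }
            }

            /* pixel < mean  is equivalent to  pixel * count < sum,
             * which avoids a division per pixel. */
            long pixel = src[y * width + x];
            dst[y * width + x] = (pixel * count < sum) ? 0 : 255;
        }
    }
}
```

Because the threshold is recomputed for every neighborhood, dark text is still separated from its background where part of the page is in shadow, which a single global threshold tends to get wrong. A larger `radius` plays a role similar to a larger `blurRadiusInPixels` in the sample above: it averages over a wider area, at a higher cost per pixel in this naive version.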