
Python Image to Text: Harnessing OCR for Text


In today’s digital world, extracting text from images has become an essential task in various business operations, especially when dealing with invoices, receipts, and other types of documents. Optical Character Recognition (OCR) technology has made it possible to convert text from images into machine-encoded text, making it accessible and usable in various applications. In this article, we will explore how to perform this task efficiently using Python.

Setting Up the Environment

To get started with extracting text from images, we need a few essential components:

  1. Tesseract: an open-source OCR engine that extracts text from images;
  2. Python libraries: we’ll be using two:
     • pytesseract: a wrapper for the Tesseract OCR engine;
     • Pillow: a library that adds image processing capabilities to Python.

Installation

First, you need to install Tesseract for your operating system. For Windows users, the latest version of the Tesseract installer can be found online. Download the .exe file and install it on your computer.

If you haven’t already installed the required Python libraries, you can do so by opening the Command Prompt (on Windows) and using the following commands:

pip install pytesseract
pip install pillow
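Before running any OCR, it can help to confirm that Python can actually find the Tesseract executable. The sketch below uses only the standard library. The function name locate_tesseract and the list of fallback locations are assumptions for illustration: the Windows path is the one used later in this article, and the others are common defaults that may not match your machine.

```python
import os
import shutil

# Candidate install locations. These are assumptions, not guaranteed paths.
DEFAULT_LOCATIONS = (
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    "/usr/bin/tesseract",
    "/usr/local/bin/tesseract",
    "/opt/homebrew/bin/tesseract",
)


def locate_tesseract(command="tesseract", extra_paths=DEFAULT_LOCATIONS):
    """Return the path to the Tesseract executable, or None if it is not found."""
    # First check whether the command is available on the PATH.
    found = shutil.which(command)
    if found:
        return found
    # Otherwise fall back to the explicit candidate paths.
    for candidate in extra_paths:
        if os.path.isfile(candidate):
            return candidate
    return None


print(locate_tesseract() or "Tesseract was not found; check the installation.")
```

If a path is printed, that value is what pytesseract.tesseract_cmd should point to.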

Sample Images

For this tutorial, we will work with three sample images, each containing text, as the source for extraction. You can use your own images with the same approach.
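If you do not have suitable images at hand, one option is to generate a simple test image with Pillow. This is only a sketch: the function name make_sample_image, the text, and the image size are hypothetical choices, and an image drawn this way is much cleaner than a real scan.

```python
from PIL import Image, ImageDraw


def make_sample_image(text="Hello OCR", size=(400, 100)):
    """Create a white image with black text drawn in Pillow's default font."""
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    # Pillow's built-in default font is small; real documents vary a lot more.
    draw.text((10, 40), text, fill="black")
    return img


sample = make_sample_image()
print(sample.size, sample.mode)
```

Calling sample.save() with a file name inside the images folder writes the image to disk so it can be used like any other sample.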

Extracting Text from a Single Image

Let’s start by extracting text from a single image using Python. In this example, we will work with the first sample image, ‘sampletext1-ocr.png’.

Here is the code structure:

  • All images are placed in the ‘images’ folder;
  • The Python code is in ‘main.py’.

Now, we can extract text from the image using Python:

from PIL import Image
from pytesseract import pytesseract

# Define the path to tesseract.exe
path_to_tesseract = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
# Define the path to the image
path_to_image = 'images/sampletext1-ocr.png'
# Point pytesseract to tesseract.exe
pytesseract.tesseract_cmd = path_to_tesseract
# Open the image with PIL
img = Image.open(path_to_image)
# Extract text from the image
text = pytesseract.image_to_string(img)
print(text)

Running this code should display the extracted text from the image.
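The same steps can be wrapped in a reusable function that fails with a clear message when the image or the Tesseract executable is missing. This is a minimal sketch, not the only way to structure it: the function name extract_text and its parameters are assumptions, while image_to_string and TesseractNotFoundError come from pytesseract.

```python
from pathlib import Path


def extract_text(image_path, tesseract_cmd=None):
    """Return the text Tesseract finds in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"No image found at {path}")

    # Imported here so the path check above works even before
    # Pillow and pytesseract are installed.
    from PIL import Image
    import pytesseract

    # Optionally override the executable location, as in the script above.
    if tesseract_cmd:
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd

    try:
        with Image.open(path) as img:
            return pytesseract.image_to_string(img)
    except pytesseract.TesseractNotFoundError as exc:
        raise RuntimeError(
            "Tesseract is not installed or tesseract_cmd points to the wrong file"
        ) from exc
```

With the layout used in this article, extract_text('images/sampletext1-ocr.png') would return the same text that the script prints.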

Extracting Text from Multiple Images

In many scenarios, you may need to extract text from multiple images. To do this efficiently, we can use Python’s os module to collect all the file names in a given directory and then iterate over them to extract text from each image:

from PIL import Image
from pytesseract import pytesseract
import os

# Define the path to tesseract.exe
path_to_tesseract = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
# Define the path to the images folder
path_to_images = r'images/'
# Point pytesseract to tesseract.exe
pytesseract.tesseract_cmd = path_to_tesseract

# Get the file names in the directory
for root, dirs, file_names in os.walk(path_to_images):
    # Iterate over each file name in the folder
    for file_name in file_names:
        # Join with root so files in subfolders resolve correctly
        img = Image.open(os.path.join(root, file_name))
        # Extract text from the image
        text = pytesseract.image_to_string(img)
        print(text)

This code extracts and prints the text from every file that os.walk finds under the ‘images’ folder, so the folder should contain only image files.
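If the folder may also contain files that are not images, a possible refinement is to filter by file extension before opening anything. The sketch below is one way to do that with pathlib. The names find_images, ocr_folder, and IMAGE_EXTENSIONS are hypothetical, and the extension set is an assumption you should adjust to your own files.

```python
from pathlib import Path

# Extensions treated as images; an assumption, adjust to match your files.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".gif"}


def find_images(folder, extensions=IMAGE_EXTENSIONS):
    """Return a sorted list of image paths under folder, including subfolders."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


def ocr_folder(folder):
    """Yield (path, text) pairs for every image found under folder."""
    # Imported here so find_images can be used without the OCR packages installed.
    from PIL import Image
    import pytesseract

    for path in find_images(folder):
        with Image.open(path) as img:
            yield path, pytesseract.image_to_string(img)
```

Looping over ocr_folder('images') then gives each file path together with its extracted text, which makes it easy to tell which output belongs to which image.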


Comparison Table 

| Comparison Criteria | pytesseract | Tesseract |
|---|---|---|
| License | Open source (Apache 2.0) | Open source (Apache 2.0) |
| Language Support | Wide range | Extensive |
| Format Support | PNG, JPEG, GIF, etc. | Multiple formats |
| Ease of Use | Easy to set up | More complex setup |
| Performance | Depends on the image | Depends on the image |
| Community and Support | Active community | Active community |
| Documentation | Extensive | Extensive |

Keep in mind that pytesseract is a wrapper around Tesseract rather than an alternative to it, so the table compares calling the engine from Python with using the engine directly.


Conclusion 

In this article, we’ve explored how to extract text from images using Python with the Tesseract OCR engine and two libraries, pytesseract and Pillow. These tools make it possible to automate data extraction from images, which is useful in many industries, from digitizing invoices to processing scanned documents.

Whether you call Tesseract through pytesseract for its simplicity or use the engine directly, the ability to convert images into machine-encoded text is a valuable skill for any Python developer.

FAQ

1. What is the main difference between pytesseract and Tesseract?

Pytesseract is a Python wrapper for Tesseract, making it more user-friendly and easier to integrate into Python applications. Tesseract is the underlying OCR engine that does the actual text extraction.

2. Can I use these libraries for non-English languages?

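Yes. Tesseract supports a wide range of languages through installable language data files, and pytesseract lets you select them with the lang argument of image_to_string, making both suitable for international applications.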
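As a rough sketch of how language selection might look in code: Tesseract accepts several language codes joined with a plus sign, such as eng+deu, and pytesseract passes that string through its lang argument. The helper names build_lang_arg and extract_text_in_languages are hypothetical, and the example assumes the language data for every requested code is already installed.

```python
def build_lang_arg(codes):
    """Join Tesseract language codes such as 'eng' and 'deu' into 'eng+deu'."""
    cleaned = [code.strip() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def extract_text_in_languages(image_path, codes):
    """Run OCR on image_path using the given Tesseract language codes."""
    # Imported here so build_lang_arg works without the OCR packages installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=build_lang_arg(codes))


print(build_lang_arg(["eng", "deu"]))
```

If a requested language is not installed, Tesseract reports an error instead of silently falling back, so it is worth checking which language data files are present first.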

3. Are there any limitations to text extraction from images?

While OCR technology has come a long way, it’s important to note that the accuracy of text extraction depends on image quality, font type, and language complexity. Complex fonts and low-quality images may result in errors.
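One way to spot likely errors is to look at the per-word confidence values that Tesseract reports. pytesseract exposes them through image_to_data, which, with output_type=pytesseract.Output.DICT, returns a dictionary containing parallel 'text' and 'conf' lists. The sketch below works on such a dictionary. The function name low_confidence_words and the threshold of 60 are assumptions, and the sample data is hand-written rather than real OCR output.

```python
def low_confidence_words(data, threshold=60.0):
    """Return (word, confidence) pairs whose confidence is below threshold.

    data is expected to look like the dictionary returned by
    pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT).
    """
    flagged = []
    for word, conf in zip(data["text"], data["conf"]):
        score = float(conf)  # some versions report confidences as strings
        # Entries with confidence -1 are layout boxes rather than words.
        if word.strip() and 0 <= score < threshold:
            flagged.append((word, score))
    return flagged


# A hand-written stand-in for real OCR output, used only for illustration.
fake_data = {
    "text": ["", "Invoice", "N0.", "42"],
    "conf": ["-1", "96.1", "41.5", "88"],
}
print(low_confidence_words(fake_data))
```

Words flagged this way are good candidates for manual review or for a second pass on a cleaned-up image.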

4. Are there any alternatives to pytesseract and Tesseract?

Yes, there are other OCR libraries and services available, such as Google Cloud Vision API, Microsoft Azure Cognitive Services, and Amazon Textract. The choice depends on your specific requirements and budget.

5. How can I improve the accuracy of text extraction from images?

You can enhance accuracy by using high-resolution images, improving image preprocessing techniques (e.g., noise reduction), and selecting the appropriate language settings for the text in the image.
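The preprocessing steps mentioned here can be sketched with Pillow alone. The function below converts an image to grayscale, enlarges it, applies a median filter for noise reduction, and thresholds it to black and white. The name preprocess_for_ocr and the default scale and threshold are assumptions; suitable values depend heavily on the images, so treat them as a starting point.

```python
from PIL import Image, ImageFilter, ImageOps


def preprocess_for_ocr(img, scale=2, threshold=160):
    """Return a grayscale, enlarged, denoised, black-and-white copy of img."""
    gray = ImageOps.grayscale(img)
    # Enlarging the image can help when the text is small.
    width, height = gray.size
    enlarged = gray.resize((width * scale, height * scale))
    # A median filter removes isolated specks of noise.
    denoised = enlarged.filter(ImageFilter.MedianFilter(size=3))
    # Map every pixel to pure black or pure white.
    return denoised.point(lambda value: 255 if value > threshold else 0)


demo = preprocess_for_ocr(Image.new("RGB", (20, 10), (200, 200, 200)))
print(demo.mode, demo.size)
```

The returned image can be passed to pytesseract.image_to_string in place of the original.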
