Python: Detecting Textblock And Deleting It From Image (opencv)

July 25, 2024

I'm currently trying to figure out how to detect a text paragraph on an image in order to remove it. I get an input image, which is similar to the image given above. From there on

Solution 1:

The idea is simple: use morphology to isolate the text block you want to detect, then use that result to build a mask that removes the region of interest from the input image. My answer is in C++, but the implementation is straightforward:

//Read input image:
std::string imagePath = "C://opencvImages//commentImage.png";
cv::Mat imageInput= cv::imread( imagePath );

//Convert it to grayscale:
cv::Mat grayImg;
cv::cvtColor( imageInput, grayImg, cv::COLOR_BGR2GRAY );

//Get binary image via Otsu:
cv::threshold( grayImg, grayImg, 0, 255 , cv::THRESH_OTSU );
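
For intuition about what the Otsu flag selects: Otsu's method picks the threshold that maximizes the between-class variance of the two pixel groups it creates. The following is a minimal, dependency-free Python sketch of that idea on a flat list of 8-bit values. The function name `otsu_threshold` and the sample data are illustrative assumptions, and this is not OpenCV's implementation.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximizes between-class variance.

    Pixels <= t form the low class, pixels > t the high class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of intensities in the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical data: a dark group around 20-30 and a bright group around 220-230.
pixels = [20, 25, 30, 22, 28] * 10 + [220, 225, 230, 222, 228] * 40
t = otsu_threshold(pixels)

# Same convention as a binary threshold: values above t become 255, the rest 0.
binary = [255 if p > t else 0 for p in pixels]
```

Any threshold that falls between the two groups separates them equally well, which is why the method works best on images with a clearly bimodal histogram.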

Up to this point, you have generated the binary image. Now, let's dilate it using a rectangular structuring element (SE) that is wider than it is tall. The idea is to join all the text horizontally and, just a little bit, vertically. If you look at the input image, the “TEST132212” text is slightly separated from the comment, apparently enough to survive the dilate operation. Here, I'm using an SE of size 9 x 6 with 2 iterations:

cv::Mat morphKernel = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(9, 6) );
int morphIterations = 2;
cv::morphologyEx( grayImg, grayImg, cv::MORPH_DILATE, morphKernel, cv::Point(-1,-1), morphIterations );
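
To see why a wide kernel merges neighbouring characters into one block while leaving distant text alone, here is a small dependency-free Python sketch of binary dilation on a toy grid. The helper `dilate_rect` and the grid are hypothetical, and the sketch simplifies things compared with the OpenCV call: it assumes odd window sizes and treats pixels outside the image as background.

```python
def dilate_rect(img, kw, kh):
    """Binary dilation of a 0/1 grid with a kw x kh rectangular window.

    A pixel becomes 1 if any pixel in the window centred on it is 1.
    Assumes odd kw and kh; pixels outside the image count as 0.
    """
    rows, cols = len(img), len(img[0])
    rx, ry = kw // 2, kh // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            for yy in range(max(0, y - ry), min(rows, y + ry + 1)):
                if any(img[yy][max(0, x - rx):min(cols, x + rx + 1)]):
                    out[y][x] = 1
                    break
    return out


# Two "letters" on the top row separated by a 2-pixel gap, and a third
# one further down standing in for a separate piece of text.
img = [
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
]

# Wide, flat window: 3 pixels wide, 1 pixel tall.
out = dilate_rect(img, kw=3, kh=1)
for row in out:
    print(row)
```

The gap in the top row closes, so the two letters become one blob, while the bottom blob only grows sideways and stays separate. On the real image, the 9 x 6 kernel plays the same role at a larger scale.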

This is the result:

I get a single block where the original comment was. Nice! This is now the largest blob in the image. If I subtract it from the binary image, I get a mask that isolates everything that is not the “comment” blob:

cv::Mat bigBlob = findBiggestBlob( grayImg );

I get this:

Now, the binary mask generation:

cv::Mat binaryMask = grayImg - bigBlob;

//Use the binaryMask to produce the final image:
cv::Mat resultImg;
imageInput.copyTo( resultImg, binaryMask );

Produces the masked image:
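
In Python, where OpenCV images are NumPy arrays, the two steps above (subtracting the big blob, then copying through the mask) can be sketched on toy data as follows. The array names and values are made up for illustration, and `np.where` stands in for the masked copy rather than reproducing `copyTo` exactly.

```python
import numpy as np

# Toy 1 x 4 "images": the dilated binary image and its largest blob.
dilated = np.array([[255, 255, 0, 255]], dtype=np.uint8)
big_blob = np.array([[255, 255, 0, 0]], dtype=np.uint8)

# Everything in the dilated image except the largest blob.
# Safe here because big_blob never exceeds dilated at any pixel.
binary_mask = dilated - big_blob

# 3-channel toy input image (BGR order, as OpenCV loads it).
image = np.full((1, 4, 3), 200, dtype=np.uint8)

# Keep input pixels where the mask is non-zero; the rest stays black.
result = np.where(binary_mask[..., None] > 0, image, 0).astype(np.uint8)
```

Only the last pixel survives, because it is the only foreground pixel that does not belong to the largest blob.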

You will have noticed the findBiggestBlob function. It's a function I wrote that returns the biggest blob in a binary image. The idea is to compute all the contours in the input image, calculate their areas, and keep the contour with the largest area. This is the C++ implementation:

//Function to get the largest blob in a binary image:
cv::Mat findBiggestBlob( cv::Mat &inputImage ){

    cv::Mat biggestBlob = inputImage.clone();

    double largest_area = 0;
    int largest_contour_index=0;

    std::vector< std::vector<cv::Point> > contours; // Vector for storing contour
    std::vector<cv::Vec4i> hierarchy;

    // Find the contours in the image
    cv::findContours( biggestBlob, contours, hierarchy, cv::RETR_CCOMP, cv::CHAIN_APPROX_SIMPLE );

    for( int i = 0; i< (int)contours.size(); i++ ) {

        //Find the area of the contour
        double a = cv::contourArea( contours[i],false);
        //Store the index of largest contour:
        if( a > largest_area ){
            largest_area = a;
            largest_contour_index = i;
        }

    }

    //Once you get the biggest blob, paint it black:
    cv::Mat tempMat = biggestBlob.clone();
    cv::drawContours( tempMat, contours, largest_contour_index, cv::Scalar(0),
                  cv::FILLED, 8, hierarchy );

    //Erase the smaller blobs:
    biggestBlob = biggestBlob - tempMat;
    tempMat.release();
    return biggestBlob;
}
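
As a rough illustration of the selection step, the contour area is the area of the polygon formed by the contour points (OpenCV documents it as computed with Green's formula), so it can differ slightly from the number of pixels the blob covers. Below is a minimal Python sketch of that idea using the shoelace formula. The helper `polygon_area` and the sample contours are hypothetical and only stand in for what `contourArea` and the loop above do.

```python
def polygon_area(points):
    """Absolute area of a closed polygon given as [(x, y), ...] (shoelace formula)."""
    twice_area = 0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2


# Hypothetical contours: a 2 x 2 square, a 10 x 4 rectangle and a small triangle.
contours = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(5, 5), (15, 5), (15, 9), (5, 9)],
    [(20, 0), (23, 0), (20, 3)],
]

# Index of the contour with the largest area, as in the loop above.
largest_index = max(range(len(contours)), key=lambda i: polygon_area(contours[i]))
print(largest_index, polygon_area(contours[largest_index]))
```

With real OpenCV contours in Python, the same selection could be written as `max(contours, key=cv2.contourArea)`.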

Edit: Since the posting of the answer, I've been learning Python. Here's the Python equivalent of the C++ code:

import cv2
import numpy as np

# Set image path
path = "D://opencvImages//"
fileName = "commentImage.png"

# Read input image:
inputImage = cv2.imread(path+fileName)

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# Set kernel (structuring element) size:
kernelSize = (9, 6)

# Set operation iterations:
opIterations = 2

# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernelSize)

# Perform Dilate:
openingImage = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)

# Find the big contours/blobs on the filtered image:
biggestBlob = openingImage.copy()
contours, hierarchy = cv2.findContours(biggestBlob, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

largestArea = 0
largestContourIndex = 0

# Loop through the contours, store the biggest one:
for i, c in enumerate(contours):

    # Get the area for the current contour:
    currentArea = cv2.contourArea(c, False)

    # Store the index of largest contour:
    if currentArea > largestArea:
        largestArea = currentArea
        largestContourIndex = i

# Once you get the biggest blob, paint it black:
tempMat = biggestBlob.copy()
# Draw the contours on the mask image:
cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)

# Erase the smaller blobs:
biggestBlob = biggestBlob - tempMat

# Generate the binary mask:
binaryMask = openingImage - biggestBlob

# Use the binaryMask to produce the final image:
resultImg = cv2.bitwise_and(inputImage, inputImage, mask = binaryMask)

cv2.imshow("Result", resultImg)
cv2.waitKey(0)
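
One difference between the two versions is worth keeping in mind. In C++, subtracting 8-bit `cv::Mat` images saturates at 0, whereas the `-` operator on NumPy `uint8` arrays wraps around modulo 256. In the code above the two agree, because the image being subtracted is never brighter than the one it is subtracted from. The following sketch, with made-up values, shows what would happen otherwise.

```python
import numpy as np

a = np.array([0, 100, 255], dtype=np.uint8)
b = np.array([255, 100, 0], dtype=np.uint8)

# NumPy uint8 arithmetic wraps modulo 256: 0 - 255 becomes 1.
wrapped = a - b

# A saturating subtraction that clamps negative results to 0.
saturated = np.where(a > b, a - b, 0).astype(np.uint8)

print(wrapped.tolist(), saturated.tolist())
```

If the inputs might not satisfy that condition, `cv2.subtract` performs the saturating subtraction directly.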
