NAME: RAVULA SHIVA KUMAR
GMAIL: ravula.shivakumar11@gmail.com
Preprocessing Task
Source code:
import pytesseract
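# Point pytesseract at the Tesseract executable (default Windows install location)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"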
import cv2
import numpy as np
from PIL import Image
# Load the image
image_path = "C:/Users/ravul/OneDrive/Desktop/[Link]"
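image = cv2.imread(image_path)
if image is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError(image_path)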
# Convert OpenCV image to PIL format
pil_image = Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
# Perform OCR
text = pytesseract.image_to_string(pil_image)
print(text)
# Grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow("Grayscale", gray)
cv2.waitKey(0)
# Binarization (Otsu's method picks the threshold, so the 0 passed here is ignored)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.imshow("Thresholded", thresh)
cv2.waitKey(0)
# Noise removal (non-local means: h=30, templateWindowSize=7, searchWindowSize=21)
denoised = cv2.fastNlMeansDenoising(thresh, None, 30, 7, 21)
cv2.imshow("Denoised", denoised)
cv2.waitKey(0)
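# Morphological operations
# Note: a 1x1 kernel leaves the image unchanged; a larger kernel such as (3, 3)
# is needed for the closing to have a visible effect.
kernel = np.ones((1, 1), np.uint8)
morph = cv2.morphologyEx(denoised, cv2.MORPH_CLOSE, kernel, iterations=1)
cv2.imshow("Morphological", morph)
cv2.waitKey(0)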
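# Deskewing (correcting skewed text)
# thresh > 0 selects the white pixels; for dark text on a light background,
# the text pixels are the ones where thresh == 0.
coords = np.column_stack(np.where(thresh > 0)).astype(np.float32)
rect = cv2.minAreaRect(coords)
angle = rect[-1]
# Fold the angle into [-45, 45] so the image is never rotated by more than 45 degrees
if angle < -45:
    angle += 90
elif angle > 45:
    angle -= 90
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
deskewed = cv2.warpAffine(image, M, (w, h), flags=cv2.INTER_CUBIC,
                          borderMode=cv2.BORDER_REPLICATE)
cv2.imshow("Deskewed Image", deskewed)
cv2.waitKey(0)
cv2.destroyAllWindows()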
# Extract text after preprocessing
processed_text = pytesseract.image_to_string(deskewed)
print(processed_text)
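The deskewing step above can be checked without an image. The following is a minimal sketch in plain Python, not part of the submitted script: the helper names `normalize_skew_angle`, `rotation_matrix` and `apply_affine` are hypothetical and introduced only for illustration. It repeats the script's angle folding and builds the 2x3 matrix using the formula given in the OpenCV documentation for `cv2.getRotationMatrix2D`, so that one can confirm the rotation center stays fixed.

```python
import math


def normalize_skew_angle(angle):
    """Fold an angle in degrees into [-45, 45], mirroring the if/elif in the script."""
    if angle < -45:
        return angle + 90
    if angle > 45:
        return angle - 90
    return angle


def rotation_matrix(center, angle_deg, scale=1.0):
    """Build a 2x3 affine matrix from the formula documented for cv2.getRotationMatrix2D.

    Hypothetical pure-Python helper, used here only to make the arithmetic visible.
    """
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_affine(M, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        M[0][0] * x + M[0][1] * y + M[0][2],
        M[1][0] * x + M[1][1] * y + M[1][2],
    )


if __name__ == "__main__":
    # Example values chosen for illustration only
    angle = normalize_skew_angle(-80)
    M = rotation_matrix((320, 240), angle)
    print(angle)
    print(apply_affine(M, (320, 240)))
```

Because the translation terms cancel at the center, the center point maps to itself for any angle, which is why the script can keep the output size at `(w, h)` and only lose content near the corners.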
Output:
Input image:
Grayscale image:
Binarization:
Noise Removal:
Morphological Operations:
Extracted Text After Preprocessing: