How to extract text from a picture using the Tesseract library
I am trying to extract the text from this picture using the Tesseract library, but it does not seem to work.
Here is the code I wrote. I tried to remove the noise from the picture and got a thresholded image,
then used slicing to get a smaller region containing only the text, without any noise, but that does not seem to work either.
When I cropped the text from the thresholded image myself, it worked, but I want this to be done in the code. I also tried to
make a mask by separating the black pixels from the rest of the image, but the output was wrong: it printed the word 'Wits'.
Here is the image I am working on and the mask that I tried to use earlier.
Can anyone help?
import cv2
import numpy as np
import pytesseract

# Load the image and convert it to grayscale
img = cv2.imread("7.png")
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Morphological opening and closing, intended to remove noise
kernel = np.ones((1, 1), np.uint8)
img = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

# Adaptive threshold to get a binary image
img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
# cv2.imshow("thresh", img)

# Copy only the text region into a new image of the same size
R, C = img.shape
text = img[280:350, 135:220]
image = np.ones((R, C))
image[280:350, 135:220] = text

cv2.imshow("image", image)
cv2.imwrite("image.png", image)

# Run OCR on the new image
result = pytesseract.image_to_string(image)
print(result)

cv2.waitKey(0)
cv2.destroyAllWindows()
Here I tried to copy only the text region onto a new white image and read the text from that, but it also doesn't work. When I took a screenshot of the output image and ran Tesseract on it manually, it worked, so I don't know where the problem is.
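
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: `np.ones((R, C))` creates a `float64` array filled with `1.0`, while the thresholded image is `uint8` with values 0 and 255. `cv2.imshow` scales floating-point images by 255 for display, so the window (and a screenshot of it) can look like black text on white even though the array handed to `pytesseract` is not a normal 8-bit image. Below is a minimal sketch of building the canvas as a white `uint8` image instead. The helper names `paste_roi_on_white` and `ocr_roi` are made up for illustration, and the slice bounds are simply the ones from the code above.

```python
import numpy as np


def paste_roi_on_white(thresh, rows, cols):
    """Copy thresh[rows, cols] onto a white uint8 canvas of the same shape."""
    # 255 is white for an 8-bit grayscale image
    canvas = np.full(thresh.shape, 255, dtype=np.uint8)
    canvas[rows, cols] = thresh[rows, cols]
    return canvas


def ocr_roi(thresh, rows=slice(280, 350), cols=slice(135, 220)):
    """Run Tesseract on the region pasted onto a white canvas."""
    # Imported here so the helper above can be used without Tesseract installed
    import pytesseract

    canvas = paste_roi_on_white(thresh, rows, cols)
    return pytesseract.image_to_string(canvas)
```

With the thresholded `img` from the code above, this would be called as `print(ocr_roi(img))`. Passing the crop `img[280:350, 135:220]` directly to `pytesseract.image_to_string` is another option to try, since that is closest to the manual crop that already worked.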