Detect presence of text
I'm trying to extract signs from images and classify them. Extracting the signs and some of the classification already work quite well.
Now I'm struggling with a seemingly simple classification: is the sign completely empty, or does it contain some text? I don't need to do OCR on the text or anything similar; I just need a simple measure to decide between those two classes. One issue is dirt on the signs I have to classify. Simple thresholding and counting black vs. white pixels doesn't work because the pixel counts are sometimes very close.
What would be a good approach to start classification in my case?
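For reference, the thresholding-and-counting baseline described above can be sketched roughly as follows. This is only an illustration, not the actual pipeline: the function name `black_ratio`, the fixed threshold of 128, and the tiny synthetic images are all assumptions, and plain Python lists stand in for real grayscale images so the snippet runs without extra libraries.

```python
def black_ratio(gray, threshold=128):
    """Return the fraction of pixels darker than `threshold`.

    `gray` is a grayscale image given as a list of rows,
    each row a list of ints in the range 0..255.
    """
    total = sum(len(row) for row in gray)
    if total == 0:
        raise ValueError("empty image")
    black = sum(1 for row in gray for px in row if px < threshold)
    return black / total


# Synthetic 10x10 images (hypothetical data, for illustration only).
clean_blank = [[255] * 10 for _ in range(10)]

# Blank sign with one compact dirt blob: a 3x4 dark patch.
dirty_blank = [[255] * 10 for _ in range(10)]
for y in range(2, 5):
    for x in range(3, 7):
        dirty_blank[y][x] = 40

# Sign with "text": three short dark strokes of 4 pixels each.
text_sign = [[255] * 10 for _ in range(10)]
for y in (2, 5, 8):
    for x in range(2, 6):
        text_sign[y][x] = 30

for name, img in [("clean", clean_blank),
                  ("dirty", dirty_blank),
                  ("text", text_sign)]:
    print(name, black_ratio(img))
```

In this toy data the dirty blank sign and the text sign each contain 12 dark pixels out of 100, so the ratio alone cannot tell them apart, which is the failure mode described in the question.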
I'm not sure I've clearly understood your task, but you have to check whether an area is blank or not? If the area is white (since you mentioned a black-and-white count), why not check the average pixel value? Do you have an example image of what you're dealing with?
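A minimal sketch of the average-pixel-value check, assuming an 8-bit grayscale image stored as a list of rows. The helper names `mean_intensity` and `looks_blank` and the cut-off of 250 are hypothetical and would need tuning on real signs.

```python
def mean_intensity(gray):
    """Return the average pixel value of a grayscale image.

    `gray` is a list of rows, each row a list of ints in 0..255.
    """
    total = 0
    count = 0
    for row in gray:
        total += sum(row)
        count += len(row)
    if count == 0:
        raise ValueError("empty image")
    return total / count


def looks_blank(gray, cutoff=250):
    """Treat the area as blank if it is almost uniformly white.

    `cutoff` is an assumed value, not a recommended setting.
    """
    return mean_intensity(gray) >= cutoff


# Hypothetical 4x4 examples.
white_area = [[255] * 4 for _ in range(4)]
marked_area = [[255, 255, 255, 255],
               [255, 0, 0, 255],
               [255, 0, 0, 255],
               [255, 255, 255, 255]]

print(mean_intensity(white_area), looks_blank(white_area))
print(mean_intensity(marked_area), looks_blank(marked_area))
```

The mean only measures how much dark area there is, not how it is distributed, so a large dark smudge lowers it just as text does.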
Blank in my case could also mean dirty but without text. Sometimes a completely empty but dirty sign ends up with more black area after thresholding than a sign with regular text.