understanding the limitations of traincascade

asked 2015-05-21 03:34:46 -0600

visionAwry

Hi,

I have been trying to train a cascade classifier for cigarette boxes but seem to get variable success depending on the type of box. If the box is mostly white I don't get a classifier that works. I have attempted multiple trainings both using a script to generate more samples, and also using manually generated samples of the objects. Neither of the methods has worked.

I was hoping someone could help with the following questions and/or point me to reading material that explains why this doesn't work:

Does the method of cascade classification have any limitation on the kind of object that can be detected? Does having more features/surfaces increase the chance of detection, e.g. comparing a cigarette box vs. a telephone? Are objects with a lot of white harder to detect?
Impact of pose of objects: I have used training images taken from a variety of angles, some directly overhead, some at varying angles from the object. This results in extra faces of the box showing in some images and not in others. Does the classifier need images of just one pose (e.g. only the front face should be clipped and not the rest)? I am asking because a similar car-based training that I have seen uses only a single pose of the car. I am not sure what this means for objects like boxes/books, which have primarily a front and a back face.
Can a cascade classifier be trained for a single type of object, e.g. a specific brand of cigarettes, or is it better to train it for cigarette packs in general and then run object detection to determine the brand? I have come across threads where people have talked about training for a general object type and using sub-classifiers to train for only a particular type of that object (e.g. flowers, and looking for a particular kind of flower). Are there any limitations on the types of objects that can be trained?
When taking images of rotated objects, part of the image background will always get saved when cropping the image. What is the impact of having a background in the cropped image? I assume that when using createsamples to generate fake samples, it makes sense to have closely cropped images with no background so that the generated samples are more realistic. I assume this is not required as strictly when using actual samples rather than generated ones.
What is the impact of a large amount of intensity variation in training images? Is it good, bad, or does it depend on the actual images that are supposed to be detected? Naotoshi Seo's blog suggests that having fewer illumination and pose variations is better for training faces. However, he does use the script, which itself adds illumination and pose variations.
Is it correct to generate positive images using ...


answered 2015-05-21 06:14:36 -0600

tomnjerry


Though there are no strict rules for training a cascade for object detection, here are some results I have gathered from my experiments with cascade training. I will answer your questions first and add my own points where needed.

The cascade classification method has no limitation on the kind of object it can detect, as long as there is some pattern that the cascade can pick up during its training phase. It is these features or patterns that the trained cascade searches for in an image when performing object detection. Definitely, having more distinct features helps the cascade perform the classification task better.
Cascade classifiers are rotation variant, i.e. they can detect an object only in the orientation for which they were trained. A change in orientation will adversely affect the performance of the classifier. Also, training images should all have a similar orientation. Using images with objects in a variety of orientations will only degrade the performance of the cascade.
A cascade can be trained for a single class of objects, i.e. I can have a cascade trained for bikes or cars (not brand specific), as most of them are quite similar with only minor changes in design. Again, the more varied the objects are within the class, the more it will affect the performance of the cascade.
Having objects in varied and complex backgrounds will result in a more robust cascade. Ideally, only the boundary of the objects present in the image is fed to the cascade during the training stage. This helps the cascade distinguish the object from the background. Also, the more closely the background is cropped from the object, the lower the chance of the cascade picking up features that are not specific to the object, which makes the cascade perform better.
The variation in intensity has its own advantages and disadvantages. Whether to include images with intensity variation depends on the environment in which I want my cascade to detect objects. E.g. if I want the cascade to detect objects in varied environments, such as dark/light backgrounds or during day/night, it would be advisable to include intensity variations. Then again, large intensity variations may leave the cascade unable to detect patterns specific to the object, resulting in poor performance. However, if the cascade is expected to detect objects under one specific condition, training it under that condition without much intensity variation will prove more fruitful.
Having individual samples with different backgrounds will make the cascade more robust, i.e. it will be better able to detect the object in complex backgrounds. Generating positive images from videos won't provide the cascade with background variations, though it will serve the purpose of providing more of the positive samples required for training.
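
To illustrate the point about orientation: one way to keep each training set to a single pose is to group the positive images by their approximate rotation and train a separate cascade per group. This is only a rough sketch, assuming you have already noted an angle for each image yourself; the file names, the angles and the 30-degree bucket size are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def bucket_by_angle(
    samples: Iterable[Tuple[str, float]], bucket_size_deg: int = 30
) -> Dict[int, List[str]]:
    """Group (image_path, angle_in_degrees) pairs into orientation buckets.

    Each key is the starting angle of a bucket, e.g. 0 covers [0, 30).
    """
    buckets: Dict[int, List[str]] = defaultdict(list)
    for path, angle in samples:
        # Normalise to [0, 360) so that -10 degrees lands in the 330 bucket.
        start = int((angle % 360) // bucket_size_deg) * bucket_size_deg
        buckets[start].append(path)
    return dict(buckets)


# Hypothetical hand-labelled samples: (file name, rotation in degrees).
labelled = [("a.jpg", 5), ("b.jpg", 25), ("c.jpg", 95), ("d.jpg", -10)]
groups = bucket_by_angle(labelled)
```

Each group would then get its own positives file and its own training run, instead of mixing all orientations into one cascade.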

Having a large number of positive samples and an even greater number of negative samples (generally 3-4 times as many) proves useful. All said, it's trial and experimentation for your specific application that makes a cascade perform well.
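
As a quick illustration of the 3-4 times rule of thumb, the arithmetic looks like this. The function name and the count of 500 positives are made up for the example, and the ratio is only a starting point, not a fixed requirement.

```python
from typing import Tuple


def suggested_negative_range(
    num_positives: int, low_ratio: int = 3, high_ratio: int = 4
) -> Tuple[int, int]:
    """Return the (minimum, maximum) negative count for a 3-4x rule of thumb."""
    if num_positives <= 0:
        raise ValueError("num_positives must be positive")
    return num_positives * low_ratio, num_positives * high_ratio


# Example: with 500 positives, aim for roughly 1500 to 2000 negatives.
low, high = suggested_negative_range(500)
```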


Comments

Thanks a lot. This information is very useful. I will retry and see if the results improve. One more thing: you mention that having varied backgrounds helps, but I don't understand how it will help if we crop the image so that it does not contain the background. So I assume you mean there should be just a little bit of background in each image?

visionAwry ( 2015-05-23 00:42:29 -0600 )edit

The entire image (object with background) is given to the cascade for training! The .txt file that we make for the positive data contains the image name along with the boundary of the object. So you do not use the cropped image, but the entire image together with the object location. Hope this clears it up.
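
A minimal sketch of how the lines of such a .txt file could be built, assuming the usual description-file layout of one line per image: the image path, the number of objects, then x y width height for each object. The file names and coordinates here are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def format_info_line(image_path: str, boxes: List[Box]) -> str:
    """Build one description-file line: path, object count, then x y w h per object."""
    if not boxes:
        raise ValueError("each listed image needs at least one object box")
    coords = " ".join(f"{x} {y} {w} {h}" for x, y, w, h in boxes)
    return f"{image_path} {len(boxes)} {coords}"


# Hypothetical full-frame images with the object location(s) marked in each.
annotations: Dict[str, List[Box]] = {
    "img/pack_001.jpg": [(140, 100, 45, 60)],
    "img/pack_002.jpg": [(100, 200, 50, 70), (300, 120, 48, 66)],
}
info_lines = [format_info_line(path, boxes) for path, boxes in annotations.items()]
```

Writing `"\n".join(info_lines)` to a text file gives the positives list; the images themselves stay uncropped, so the background around each box is still there.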

tomnjerry ( 2015-05-23 01:06:04 -0600 )edit

Yes, it does. Thanks a lot.

visionAwry ( 2015-05-23 22:43:28 -0600 )edit

I am planning to train cascades on images at different rotations to compensate for the cascade's lack of rotation invariance. Is this effective?

bertumen.wj ( 2016-08-28 20:29:36 -0600 )edit
