OK, after some more months of research, I think I can answer my own question. Basically, the reason the negative processing gets slower with each stage is the following.
Simply put, it works something like this:
For each stage, the training algorithm has to look through the negative set that was passed for samples that are still discriminative enough, i.e. that are still misclassified as positives by all previously trained stages. In the beginning this is easy, because no stages have been trained yet. However, the further we get, the more negatives are already correctly rejected by the existing stages, and thus the more windows the algorithm has to scan to find negatives that are still wrongly classified.
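To make this concrete, here is a rough back-of-the-envelope sketch in Python. It assumes each stage is trained down to a per-stage false alarm rate of 0.5 (the `opencv_traincascade` default for `-maxFalseAlarmRate`); the numbers are purely illustrative, not measured:

```python
# Back-of-the-envelope sketch: why hard-negative collection slows down
# with every added stage. All numbers are illustrative assumptions.

num_neg = 1000            # hard negatives required per stage
false_alarm_rate = 0.5    # fraction of negatives a single stage still accepts

for stage in range(10):
    # Fraction of raw negative windows that survive all previous stages,
    # i.e. that are still (wrongly) classified as positive.
    surviving_fraction = false_alarm_rate ** stage
    # Expected number of windows the trainer must scan to find num_neg
    # windows that the current cascade still misclassifies.
    expected_scans = num_neg / surviving_fraction
    print(f"stage {stage:2d}: ~{expected_scans:,.0f} windows scanned "
          f"to collect {num_neg} hard negatives")
```

Since the surviving fraction shrinks geometrically, the number of windows to scan grows geometrically, which matches the slowdown you observe per stage.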
This also means that you should put some thought into your negative training set. If you have 10 images that look exactly the same, it is better to supply only one of them, since the other nine won't help improve your classifier any further.
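As an illustration, weeding out near-duplicate negatives before training could look like the sketch below. It assumes JPEG negatives in a `negatives/` folder and uses a simple 8x8 average hash with an exact-match test; a real perceptual hash with a Hamming-distance threshold would be more robust:

```python
# Hedged sketch: drop near-duplicate negatives before training.
# The folder layout and the 8x8 average hash are assumptions.

import glob
import cv2

def average_hash(img, size=8):
    """Tiny average hash: resize to size x size, threshold at the mean."""
    small = cv2.resize(img, (size, size), interpolation=cv2.INTER_AREA)
    return tuple((small > small.mean()).flatten())

seen = set()
unique_negatives = []
for path in sorted(glob.glob("negatives/*.jpg")):  # hypothetical folder
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        continue  # skip unreadable files
    h = average_hash(img)
    if h not in seen:
        seen.add(h)
        unique_negatives.append(path)

# Write the background description file that opencv_traincascade reads via -bg.
with open("negatives.txt", "w") as f:
    f.write("\n".join(unique_negatives))
```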
I do think this is a part of cascade classifiers that hasn't been researched enough: defining the relation between the actual negatives and the training process. People often say that 'the set has to be diverse enough', but that remains a rather vague guideline.