
Hi Maria:

Thank you so much for your long and exhaustive reply. I read it carefully, as it deserves proper attention and concentration. I still have doubts, not about the formula itself, which is very clear to me now, but about the methodology. First, the fact that -numPos has to be different from the number of samples in the .vec file is not explained in any official documentation, such as chapter four of opencv_user.pdf, and anybody who reads it gets the message that they are the same (we need to update the documentation).

Second, and more important, the formula, as you say, does not give a systematic and final way to calculate the -numPos value (given an existing .vec file built from a good dataset of positives), because we cannot know the value of S in the formula in advance. What we do know is our setting of minHitRate, so if we set minHitRate to 0.9999999… it seems we would never get an error, since the number of falseNegativeCount samples would always be less than one. Another possibility is the one already mentioned, setting numPos = 0.9 × num_in_vec or 0.8 × num_in_vec, but that also looks to me like a kind of trick with no guarantee of success. The big problem here, as you know, is the extremely long computation time of every stage: having the process crash after a few stages means wasting hours of work, with no guarantee that the new settings will reach the desired final stage without crashing again.
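To make the dependence on S concrete, here is a minimal Python sketch. It assumes the bound has the form num_in_vec >= numPos + (numStages - 1) × (1 - minHitRate) × numPos + S, which is my reading of the formula and should be checked against your reply. The function name `max_num_pos` and its parameter names are only illustrative and are not part of OpenCV. Since S is not known in advance, the caller has to guess it, so the result is an estimate and not a guarantee.

```python
import math


def max_num_pos(num_in_vec, num_stages, min_hit_rate, s=0):
    """Estimate the largest numPos that satisfies the assumed bound

        num_in_vec >= numPos + (numStages - 1) * (1 - minHitRate) * numPos + S

    `s` is a guess for S (positives in the .vec file that are rejected
    as background); it cannot be known before training.
    """
    if not 0.0 < min_hit_rate <= 1.0:
        raise ValueError("min_hit_rate must be in (0, 1]")
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")

    # Positives consumed per unit of numPos over all stages (assumed model).
    denom = 1.0 + (num_stages - 1) * (1.0 - min_hit_rate)

    # Round down so the bound still holds; never return a negative count.
    return max(0, math.floor((num_in_vec - s) / denom))


if __name__ == "__main__":
    # 2000 samples in the .vec file, 20 stages, minHitRate 0.995, S guessed as 0.
    print(max_num_pos(2000, 20, 0.995))
    # The same setup with a pessimistic guess of S = 100.
    print(max_num_pos(2000, 20, 0.995, s=100))
```

With S guessed as 0 the result is only an upper limit; any positive S lowers it, which is exactly why the formula alone cannot rule out a crash.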

What I did was use OpenCV 2.2 (I downgraded!), where you can set numPos = num_in_vec without problems and still get your XML classifier (somehow). Version 2.2, for example, consumes fewer and fewer positives at every stage, discarding the falseNegativeCount samples that you mention.

But now I ask for your kind advice about what is best to do: after creating a good dataset of, let's say, 2000 positives and generating the .vec file, what should we do so that the process does not crash? Should we set numPos = 0.9 × num_in_vec, or minHitRate = 0.999999, or use version 2.2, or do you have a better suggestion?

One final request is about a different matter. As I am working on this for my master's thesis, I would like to study in depth how the traincascade code works with the three different feature types: Haar, LBP and HOG. I would appreciate any suggestions and tips from you on where to start studying the C++ code.

Best regards.

Marco Romagnoli