The elusive road to LBP cascade training... help!

I'm fairly new to OpenCV, working on a fun personal project -- using a webcam + OpenCV on a Raspberry Pi to detect birds in my fig tree and trigger a scare device. But I'm hitting serious roadblocks trying to train a simple LBP cascade.

I have a simple test app which runs detectMultiScale on a few test images. Using the standard lbpcascade_frontalface.xml the test successfully detects faces within ~200 milliseconds, which is awesome.

Then I took 11 different photos of a square cookie cutter and cropped them to 24x24, obtained 20 random negative images cropped to 200x200, and created 12 test images at 320x240. See the images here.

I then trained the cascade.xml with:

opencv_traincascade -data data -vec vec -bg bg.txt -featureType LBP -w 24 -h 24

and ran my detection with:

bird_cascade.detectMultiScale(gray, birds, 1.1, 2, 0, Size(80, 80));
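To make the `1.1` scale factor and the `Size(80, 80)` minimum size in that call more concrete, here is a rough sketch of which window sizes would be searched. It is a simplified model and an assumption on my part, not OpenCV's actual implementation: it assumes the detector window starts at the 24x24 training size, grows by the scale factor at each step, skips sizes below the minimum, and stops once the window no longer fits in the image. `windowSizes` is a hypothetical helper, not an OpenCV function.

```cpp
#include <cassert>
#include <vector>

// Simplified model (an assumption, not OpenCV's real code): the detector
// window starts at the training size and is multiplied by scaleFactor at
// every step. Sizes below minSize are skipped, and the search stops once
// the window no longer fits inside the image.
std::vector<double> windowSizes(int trainSize, double scaleFactor,
                                int minSize, int imgW, int imgH) {
    assert(trainSize > 0 && scaleFactor > 1.0);
    std::vector<double> sizes;
    for (double s = trainSize; s <= imgW && s <= imgH; s *= scaleFactor) {
        if (s >= minSize) {
            sizes.push_back(s);  // this window size would be scanned
        }
    }
    return sizes;
}
```

Under this model, `windowSizes(24, 1.1, 80, 320, 240)` yields 12 window sizes, from roughly 83 up to roughly 236 pixels, so anything smaller than 80 pixels in a 320x240 test image would never be considered.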




Results:

  • the xml file is 390KB, very large compared to lbpcascade_frontalface.xml which is 52KB. Why does this much smaller data set result in a much larger xml file??
  • detectMultiScale takes ~42 seconds per image, roughly 200 times slower than face detection. Again, completely opposite of what I'd expect.
  • detectMultiScale doesn't detect any of the star shapes, even on images which contain the exact shapes from the positives training set... rather it always appears to detect a single 104x104 match in the center of the image, regardless of the image :(

Things I've tried in training which haven't made any difference:

  • specifying numPos 11 and numNeg 20
  • specifying maxFalseAlarmRate of 0.95 (just stabbing in the dark here, haven't found any online docs that explain this very clearly)
  • specifying numStages 10 (this actually reduced the xml file size to 220KB and detection time to 33 seconds, but that's still awful)
  • using a single star image + opencv_createsamples to produce a vec file with 2000 positives, and supplying 100 negatives. This took over 3 hours to train, but the xml is 505KB and detection takes ~70 seconds, with no successes.
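For what it's worth, the opencv_traincascade documentation describes minHitRate and maxFalseAlarmRate as per-stage targets, with the overall rate estimated as the per-stage rate raised to the power of the number of stages. A minimal sketch of that arithmetic, assuming every stage exactly meets its target (`overallRate` is a hypothetical helper, not part of OpenCV):

```cpp
#include <cassert>
#include <cmath>

// Estimated overall rate of a cascade, assuming each of the numStages
// stages meets the same per-stage rate: perStageRate ^ numStages.
double overallRate(double perStageRate, int numStages) {
    assert(perStageRate >= 0.0 && perStageRate <= 1.0 && numStages >= 0);
    return std::pow(perStageRate, numStages);
}
```

If the 0.95 and 10-stage settings above were combined, `overallRate(0.95, 10)` comes to about 0.60, meaning roughly 60% of negative windows could still pass the whole cascade. The defaults (maxFalseAlarmRate 0.5, numStages 20) give about 9.5e-7, and the default minHitRate of 0.995 over 20 stages gives an overall hit rate of about 0.90.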

As you can see, I've spent a lot of time but am getting nowhere. I've studied several online tutorials/references but none addresses the total failure I'm encountering.


Any of the following would be EXTREMELY helpful:

  • specific insights as to why I'm getting complete failure in my case, and which bits to alter to achieve success.
  • a thorough explanation of the LBP algorithm which would give some intuition for dialing in the mysterious training/detection parameters such as numPos, numStages, minHitRate, maxFalseAlarmRate, scaleFactor, minNeighbors (I'm having trouble making sense of this write-up by Maria Dimashova)
  • pointer to a good starter tutorial for LBP training -- the standard docs are pretty weak
  • I want to start small, but is it even possible to produce a 'test' cascade that doesn't require hundreds or thousands of positives & negatives?

Thanks in advance!
Ken