This is because you are basically running the cascade classifier independently on each frame. The input from your webcam has slight lighting variations from frame to frame, which can lead to a slightly different detection result. Since the algorithm merges overlapping detection windows, and each frame can produce a different number of overlapping windows, the merged box can shift or change size a little between frames.
If you want a more stable result, apply an averaging filter over several frames and fit a fixed-size box around the center of the averaged output.
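A minimal sketch of that idea in Python, assuming each detection arrives as an `(x, y, w, h)` tuple like the rectangles a cascade detector returns. The `BoxSmoother` class, its `window` parameter and the box size are hypothetical names chosen for illustration, not part of OpenCV:

```python
from collections import deque


class BoxSmoother:
    """Average detection centers over the last few frames (hypothetical helper)."""

    def __init__(self, box_size, window=5):
        # box_size: fixed (width, height) of the box that is drawn
        # window: number of recent frames to average over (an assumed default)
        self.box_w, self.box_h = box_size
        self.centers = deque(maxlen=window)

    def update(self, detection):
        # detection: (x, y, w, h) of the raw detection in the current frame
        x, y, w, h = detection
        self.centers.append((x + w / 2.0, y + h / 2.0))

        # Moving average of the center over the stored frames
        n = len(self.centers)
        cx = sum(c[0] for c in self.centers) / n
        cy = sum(c[1] for c in self.centers) / n

        # Fit the fixed-size box around the averaged center
        left = int(round(cx - self.box_w / 2.0))
        top = int(round(cy - self.box_h / 2.0))
        return (left, top, self.box_w, self.box_h)
```

In a capture loop you would call `smoother.update(rect)` on the detection from each frame and draw the returned box instead of the raw one. A larger `window` gives a steadier box, but it reacts more slowly when the object really moves.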
Cheers!