I wrote a simple Python script to process an MP4 video that I took with my phone. The phone was held perfectly still in a clamp, pointed at my room, with nothing moving in the scene. I run every frame through SIFT, draw the detected keypoints, and write the result out as a video.
Why do some keypoints come and go if the recording shows the same static room? Is there some uncertainty or randomness inherent in the detector algorithm? Or is it more likely compression artifacts introduced by the H.264 encoding? Maybe my lighting, running at 60 Hz, flickers just enough that frames periodically differ? I'm not sure, but these are the things I'm speculating about.
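One way I could probably narrow down the lighting and compression theories is to measure how much the decoded frames themselves change, before SIFT is involved at all. This is only a rough sketch: `frame_stats` is a hypothetical helper name, it assumes NumPy is available, and it assumes the frames have already been decoded to grayscale arrays of identical shape (for example with `cv2.VideoCapture` and `cv2.cvtColor`).

```python
import numpy as np


def frame_stats(frames):
    """Return (means, diffs) for a sequence of grayscale frames.

    means[i] is the average brightness of frame i.
    diffs[i] is the mean absolute pixel difference between frame i
    and the first frame.
    """
    means, diffs = [], []
    first = None
    for frame in frames:
        # Work in float so the subtraction cannot wrap around on uint8 data.
        f = np.asarray(frame, dtype=np.float64)
        if first is None:
            first = f
        means.append(float(f.mean()))
        diffs.append(float(np.abs(f - first).mean()))
    return means, diffs
```

If the brightness values swing periodically, that would suggest the lighting is involved. If the brightness stays flat but the pixel differences are nonzero, that would point more toward sensor noise or the encoder.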
If I ran SIFT against the same JPEG hundreds of times, would you expect to get exactly the same keypoints, or would some of them come and go as well?
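The repeated-run experiment I have in mind could be sketched like this. The helper names (`keypoint_signature`, `count_distinct_results`, `run_on_file`) are made up for illustration, and the OpenCV part assumes a build that provides `cv2.SIFT_create` (OpenCV 4.4 or later). Rounding the coordinates to three decimals is an arbitrary choice on my part.

```python
def keypoint_signature(keypoints, ndigits=3):
    """Reduce keypoints to a hashable set of rounded (x, y, size, angle) tuples."""
    return frozenset(
        (
            round(kp.pt[0], ndigits),
            round(kp.pt[1], ndigits),
            round(kp.size, ndigits),
            round(kp.angle, ndigits),
        )
        for kp in keypoints
    )


def count_distinct_results(detect, image, runs=100):
    """Call detect(image) `runs` times and count the distinct keypoint sets seen."""
    return len({keypoint_signature(detect(image)) for _ in range(runs)})


def run_on_file(path, runs=100):
    """Run the repeatability check with OpenCV's SIFT on one image file."""
    import cv2  # imported here so the helpers above work without OpenCV

    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:
        raise FileNotFoundError(path)
    sift = cv2.SIFT_create()
    return count_distinct_results(lambda img: sift.detect(img, None), image, runs)
```

A result of 1 from `run_on_file` would mean every run produced the same set of keypoints. Anything larger would mean the detector output itself varies on identical input.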
Thanks for any advice you can give, I'm just curious about why this sort of thing happens.