The first part isn't related to OpenCV. There are plenty of libraries that can help you parse XMP metadata, depending on your platform. I have used Exiv2 successfully (Windows/Linux).
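
If you just want to see what is in the file before pulling in a library, here is a rough sketch using only the Python standard library. It is not Exiv2 and not how Exiv2 works internally: it assumes the XMP packet is stored as plain XML between `<x:xmpmeta` and `</x:xmpmeta>` (the usual case for a standard packet in a JPEG) and simply scans the raw bytes for it. The function names `extract_xmp` and `xmp_properties` are mine, and nested structures such as `rdf:Bag` or `rdf:Seq` are ignored.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# Assumption: the packet is stored uncompressed, from <x:xmpmeta ...> to </x:xmpmeta>.
XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)


def extract_xmp(data: bytes):
    """Return the first XMP packet found in `data` as an XML element, or None."""
    match = XMP_PACKET.search(data)
    if match is None:
        return None
    return ET.fromstring(match.group(0))


def xmp_properties(root):
    """Collect simple properties from every rdf:Description node.

    Keys are in ElementTree's "{namespace}name" form. Only attribute-style
    properties and leaf elements with text are collected.
    """
    props = {}
    for desc in root.iter(RDF_NS + "Description"):
        for key, value in desc.attrib.items():
            if not key.startswith(RDF_NS):  # skip rdf:about and friends
                props[key] = value
        for child in desc:
            if len(child) == 0 and child.text and child.text.strip():
                props[child.tag] = child.text.strip()
    return props
```

Typical use would be `xmp_properties(extract_xmp(open(path, "rb").read()))` after checking that `extract_xmp` did not return `None`. For anything beyond a quick look (extended XMP, writing metadata, other containers), a dedicated library such as Exiv2 is the safer choice.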
For the second part, see this post, which could be a good starting point for the Bag of Words approach, adapted to your case.
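
To give an idea of what Bag of Words boils down to, here is a minimal sketch of the quantisation step in plain Python. It assumes you already have a vocabulary of "visual words" (cluster centres, usually obtained by running k-means over descriptors from your training images) and a list of local descriptors for one image. The names `nearest_word` and `bow_histogram` are made up for illustration, and the 2-D vectors in the usage line are toy values, not real descriptors.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster centre) closest to `descriptor`."""
    return min(
        range(len(vocabulary)),
        key=lambda i: math.dist(descriptor, vocabulary[i]),
    )


def bow_histogram(descriptors, vocabulary):
    """Normalised histogram of visual-word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors were found in the image: return an all-zero histogram.
        return [0.0] * len(vocabulary)
    return [count / total for count in counts]
```

For example, `bow_histogram([(1, 0), (0, 1), (9, 9), (0, 0)], [(0, 0), (10, 10)])` gives `[0.75, 0.25]`. Each image ends up as one fixed-length vector, which you can then feed to a classifier or compare with a distance measure.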