
There is a difference between object detection and feature detection. Object detection is a very difficult task, and its results depend strongly on the object you want to detect and on the image samples you used for training. A Haar cascade, for example, works well for rigid, non-rotated objects taken from the same perspective. This is why you get good results for faces: they are rather rigid and usually have the same orientation.

Features describe characteristic image spots or regions, such as edges or corners. A feature consists of

a keypoint with
- x- and y-coordinates
- scale (for scale invariant features)
- orientation (for orientation invariant features)

and

a descriptor, which is usually a fixed-length vector describing the area defined by the keypoint. Computing image features consists of first detecting keypoints and then extracting their descriptors. There is no training needed for that.

Features can be used for object detection, for example, with a method called 'Bag of Words' (BOW). BOW extracts features from the object samples and tries to define a set of features that is characteristic of that object.
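As a rough illustration of the idea, the sketch below turns the descriptors of one image into a histogram over a small "vocabulary" of representative descriptors. The vocabulary and the 2-D descriptors are made-up toy values; in practice the vocabulary is usually obtained by clustering (e.g. k-means) the descriptors of the training samples, and the histograms are then fed to a classifier. The function names are hypothetical, not OpenCV API.

```python
import math
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bow_histogram(descriptors, vocabulary):
    """Normalized count of how often each visual word occurs in an image."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total if total else 0.0
            for i in range(len(vocabulary))]

# Toy data (assumed for illustration): three visual words, four descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.0, 0.2), (0.0, 0.8)]

histogram = bow_histogram(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

The histogram has the same length for every image regardless of how many keypoints were found, which is what makes it usable as input for a classifier.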

One method that I find gives impressive detection accuracy is the LatentSvmDetector. Unfortunately, OpenCV doesn't provide any way to train models for your own objects (yet?). A few trained models are available, such as people, cats, and airplanes, but I'm afraid there are no rats ;)