Object height detection (single camera)
Dear all,
Since I could not find an answer to this question in this forum, I decided to sign up and post it. Out of curiosity, I started a small OpenCV project. The aim is to measure an object's height with a single camera. The camera can safely be assumed to be fixed, and the object's distance to the camera is known, so the problem should be solvable.
Algorithm
Currently my algorithm is as follows: (1) Calibrate the camera. (2) Manually choose a point on the video stream that lies on the ground. (3) Detect the object and its upper boundary (y-coordinate) on the video stream. (4) Calculate the height: the difference between the reference point from (2) and the upper object boundary from (3) is the object's height.
Problem
This seems to work; however, there is an error of 3 to 10 centimeters. The error seems to depend on (a) the quality of the calibration, (b) the location of the object along the video's x-axis (i.e. the camera does not seem to be parallel to the ground), and (c) the y-position on the screen (the higher the object, the larger the error).
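One way to check whether a small camera tilt alone could produce an error of this kind is a simulation. The sketch below is not part of my actual code: it assumes an ideal pinhole camera without lens distortion, a pure pitch rotation about the camera's x-axis, and made-up values for f_y, c_y, camera height and distance. It projects a ground point and the object's top into a tilted camera and then applies the untilted formula from step (4).

```python
import math

def project_row(y_level, z_level, pitch_rad, f_y, c_y):
    """Pixel row of a point given in a level frame at the camera centre.

    y_level points down, z_level points horizontally forward (both in mm).
    A positive pitch means the camera looks downwards.
    """
    y_cam = y_level * math.cos(pitch_rad) - z_level * math.sin(pitch_rad)
    z_cam = y_level * math.sin(pitch_rad) + z_level * math.cos(pitch_rad)
    return c_y + f_y * y_cam / z_cam

def estimated_height(pitch_deg, true_height, cam_height, distance,
                     f_y=800.0, c_y=240.0):
    """Height (mm) obtained when the tilt is ignored in the back-projection."""
    pitch = math.radians(pitch_deg)
    # The ground point lies cam_height below the camera, the top of the object
    # lies (cam_height - true_height) below it; both at the same distance.
    v_ground = project_row(cam_height, distance, pitch, f_y, c_y)
    v_top = project_row(cam_height - true_height, distance, pitch, f_y, c_y)
    # Untilted formula: y = (v - c_y) * z / f_y, evaluated for both rows.
    return (v_ground - v_top) * distance / f_y

# All numbers below are hypothetical.
for pitch_deg in (0.0, 2.0, 5.0):
    est = estimated_height(pitch_deg, true_height=1000.0,
                           cam_height=1200.0, distance=2000.0)
    print(f"pitch {pitch_deg:.1f} deg: estimated {est:.1f} mm, "
          f"error {est - 1000.0:+.1f} mm")
```

With zero pitch the estimate matches the true height. Any deviation at non-zero pitch comes purely from ignoring the rotation, so comparing it with the observed error may indicate whether tilt is a plausible cause.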
As a result, I guess that I am doing something entirely wrong. To be more concrete, I will lay out the steps (1) to (4) in greater detail.
(1) Camera Calibration
Calibration is done with chessboard patterns whose squares each measure 26 mm. Basically I use an adaptation of the Emgu CV (C# bindings) examples and this link: http://dasl.mem.drexel.edu/~noahKuntz/openCVTut10.html
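The square size enters the calibration only through the 3D coordinates assigned to the chessboard corners. As an illustration (in Python rather than the C# bindings, and with a hypothetical board of 9 x 6 inner corners), the object points for one view could be generated like this:

```python
def chessboard_object_points(cols, rows, square_mm=26.0):
    """Return the inner-corner coordinates (X, Y, Z) in mm, row by row.

    The board is assumed to lie in the plane Z = 0 of its own coordinate system.
    """
    return [(c * square_mm, r * square_mm, 0.0)
            for r in range(rows)
            for c in range(cols)]

# Hypothetical board: 9 x 6 inner corners, 26 mm squares.
points = chessboard_object_points(cols=9, rows=6)
print(len(points), points[:3])
```

In OpenCV's Python bindings, such a list would typically be converted to a float32 array and passed, once per calibration image, to cv2.calibrateCamera together with the corners found by cv2.findChessboardCorners. Because the object points are given in mm, the extrinsic translations come out in mm, while f_y and c_y are in pixels.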
(2) Manually choose a point that is on the ground
For convenience, I simply click on the (x, y) coordinate in the video stream where the (image of the) ground meets the (image of the) wall in my room. Simple enough...
(3) Detect upper object boundaries
Simple feature detection, which works well (verified by drawing circles around the detected features).
(4) Calculate height
Here it gets a bit tricky - though my approach is fairly simple. According to http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html (more specifically this formula: http://docs.opencv.org/_images/math/69a88b04c61001bf4e198abae39569e8bc3e81c2.png), one should be able to compute real-world Y-coordinates by calculating y = (v - c_y) * z / f_y. Using this formula, I calculate y_upperBoundary and y_ground in real-world coordinates (relative to the camera's absolute position in real-world coordinates, I assume). I provide the following inputs:
- v = y-coordinate on the video stream for the upper boundary of the object or the ground, respectively. Measured in pixels.
- f_y = intrinsic camera parameter at row 1, column 1 (starting to count from 0). Measured in pixels.
- c_y = intrinsic camera parameters at row 1, column 2 (starting to count from 0). Measured in pixels.
- z = real-world Z-distance from the camera to the object, measured in mm (the calibration has also been set up in mm).
I then calculate (y_ground - y_upperBoundary) / 1000 to get the object's height in meters (the difference itself is in mm).
I assume the last step (calculating y_ground minus y_upperBoundary) is necessary, because ...
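To make step (4) concrete, here is a minimal sketch of the calculation in Python. The camera matrix and pixel rows are made-up example values, not my calibration results. The sketch also assumes that the pixel coordinates are already undistorted and that the optical axis is parallel to the ground.

```python
def pixel_row_to_camera_y(v, f_y, c_y, z):
    """Back-project pixel row v to a camera-frame Y-coordinate at depth z."""
    return (v - c_y) * z / f_y

def object_height_mm(v_ground, v_top, f_y, c_y, z_mm):
    """Height in mm as the difference between ground and top Y-coordinates."""
    y_ground = pixel_row_to_camera_y(v_ground, f_y, c_y, z_mm)
    y_top = pixel_row_to_camera_y(v_top, f_y, c_y, z_mm)
    # Image rows grow downwards, so the ground has the larger Y value.
    return y_ground - y_top

# Hypothetical 3x3 camera matrix (0-based indexing):
# K[1][1] is f_y and K[1][2] is c_y, both in pixels.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
f_y, c_y = K[1][1], K[1][2]

height_mm = object_height_mm(v_ground=440.0, v_top=40.0,
                             f_y=f_y, c_y=c_y, z_mm=2000.0)
height_m = height_mm / 1000.0
print(height_mm, height_m)
```

Note that c_y cancels in the subtraction, so the result reduces to (v_ground - v_top) * z / f_y. Under these assumptions only f_y, z and the two pixel rows affect the computed height.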