Normalization of really small numbers
I came across this problem today while calculating Hu invariants for some digit images. When the input image was NOT treated as binary, the moments were very small, often much smaller than DBL_EPSILON (which will be important later!). The calculated invariants for each digit filled a row in a Mat, so I obtained a matrix of Hu invariants (column-wise) for my digits (row-wise). Then I wanted to normalize the invariants column by column to the 0-100 range with:
normalize(A.col(c), B.col(c), 0, 100, NORM_MINMAX);
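(As background, the matrix A of invariants was filled roughly like this; this is a simplified sketch rather than my exact code, and the helper name and the digits container are placeholders.)

#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

// One row of 7 Hu invariants per digit image; binaryImage = false,
// so the grayscale values enter the moments directly, which is why
// the higher-order invariants come out so small.
Mat buildHuMatrix(const std::vector<Mat>& digits)
{
    Mat A((int)digits.size(), 7, CV_64F);
    for (int i = 0; i < (int)digits.size(); i++)
    {
        Moments m = moments(digits[i], false);
        double hu[7];
        HuMoments(m, hu);
        for (int c = 0; c < 7; c++)
            A.at<double>(i, c) = hu[c];
    }
    return A;
}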
What I noticed was that most of my 7 columns were normalized properly, but 2 of them were filled with zeros after normalization. That was not right, so I normalized my matrix manually:
// manual min-max normalization, column by column
for (int c = 0; c < A.cols; c++)
{
    double minV, maxV;
    minMaxIdx(A.col(c), &minV, &maxV);
    for (int r = 0; r < A.rows; r++)
        C.at<double>(r, c) = 100 * (A.at<double>(r, c) - minV) / (maxV - minV);
}
and the result was as expected.
I had a look inside the 'normalize' method and noticed this:
scale = (dmax - dmin)*(smax - smin > DBL_EPSILON ? 1./(smax - smin) : 0);
which means that if the elements to be normalized are spread over a very small range, they are not normalized at all but set to zero (or to the low end of the requested range). I understand that this is meant to prevent numerical errors, but I am not sure it should be done this way when the numbers involved are ALL very small.
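A possible workaround, and this is just a sketch of what I have in mind rather than a proposed patch, would be to make the degeneracy check relative to the magnitude of the values instead of absolute; the DBL_EPSILON * magnitude threshold below is my own choice:

#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cfloat>
#include <cmath>
using namespace cv;

// Min-max normalization of one column to 0..100 with a relative,
// magnitude-aware check instead of the absolute DBL_EPSILON test.
static Mat normalizeColumnRelative(const Mat& src)
{
    double minV, maxV;
    minMaxIdx(src, &minV, &maxV);

    double range = maxV - minV;
    double magnitude = std::max(std::abs(minV), std::abs(maxV));

    Mat dst;
    if (range > DBL_EPSILON * magnitude)
        // dst = 100 * (src - minV) / range, written via convertTo
        src.convertTo(dst, CV_64F, 100.0 / range, -100.0 * minV / range);
    else
        dst = Mat::zeros(src.size(), CV_64F); // genuinely constant column
    return dst;
}

With a relative check like this, the third column of the test matrix below would be normalized like my manual result, while a truly constant column would still map to zeros.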
So I performed a small test. Here is my input matrix:
A
10 5e-016 5e-026
1 5e-020 5e-027
-10 1e-030 -5e-027
The result of the OpenCV normalization (note that the range of the second column, about 5e-16, is just above DBL_EPSILON ≈ 2.2e-16, while the range of the third, about 5.5e-26, is far below it):
B
100 100 0
55 0.01 0
0 0 0
And my manual result:
C
100 100 100
55 0.01 18.1818
0 0 0
Here is the code I used for the above:
Mat A = (Mat_<double>(3, 3) << 10, 5e-16, 5e-26, 1, 5e-20, 5e-27, -10, 1e-30, -5e-27);

// OpenCV normalization, column by column
Mat B(3, 3, CV_64F);
for (int c = 0; c < 3; c++)
    normalize(A.col(c), B.col(c), 0, 100, NORM_MINMAX);

// manual min-max normalization, column by column
Mat C(3, 3, CV_64F);
for (int c = 0; c < 3; c++)
{
    double minV, maxV;
    minMaxIdx(A.col(c), &minV, &maxV);
    for (int r = 0; r < 3; r++)
        C.at<double>(r, c) = 100 * (A.at<double>(r, c) - minV) / (maxV - minV);
}

cout << "A" << endl;
for (int r = 0; r < 3; r++)
{
    for (int c = 0; c < 3; c++)
        cout << A.at<double>(r, c) << " ";
    cout << endl;
}

cout << endl << "B" << endl;
for (int r = 0; r < 3; r++)
{
    for (int c = 0; c < 3; c++)
        cout << B.at<double>(r, c) << " ";
    cout << endl;
}

cout << endl << "C" << endl;
for (int r = 0; r < 3; r++)
{
    for (int c = 0; c < 3; c++)
        cout << C.at<double>(r, c) << " ";
    cout << endl;
}
I think this has little practical impact on classification with Hu invariants, as it is quite unlikely to get invariants that differ by less than DBL_EPSILON for different objects (mine were ...