Revision history [back]

Error in gauss-newton implementation for pose optimization

Hello. I’m using a modified version of a gauss-newton method to refine a pose estimate using OpenCV. The unmodified code can be found here: http://people.rennes.inria.fr/Eric.Marchand/pose-estimation/tutorial-pose-gauss-newton-opencv.html

The details of this approach are outlined in the corresponding paper:

Marchand, Eric, Hideaki Uchiyama, and Fabien Spindler. "Pose estimation for augmented reality: a hands-on survey." IEEE transactions on visualization and computer graphics 22.12 (2016): 2633-2651.

A PDF can be found here: https://hal.inria.fr/hal-01246370/document

The part that is relevant (Pages 4 and 5) are screencapped below:

Here is what I have done. First, I’ve (hopefully) “corrected” some errors: (a) dt and dR can be passed by reference to exponential_map() (even though cv::Mat is essentially a pointer). (b) The last entry of each 2x6 Jacobian matrix, J.at<double>(i*2+1,5), was -x[i].y but should be -x[i].x. (c) I’ve also tried using a different formula for the projection. Specifically, one that includes the focal length and principal point:

xq.at<double>(i*2,0)   = cx + fx * cX.at<double>(0,0) / cX.at<double>(2,0);
xq.at<double>(i*2+1,0) = cy + fy * cX.at<double>(1,0) / cX.at<double>(2,0);

Here is the relevant code I am using, in its entirety (control starts at optimizePose3()):

void exponential_map(const cv::Mat &v, cv::Mat &dt, cv::Mat &dR)
{
  double vx = v.at<double>(0,0);
  double vy = v.at<double>(1,0);
  double vz = v.at<double>(2,0);
  double vtux = v.at<double>(3,0);
  double vtuy = v.at<double>(4,0);
  double vtuz = v.at<double>(5,0);
  cv::Mat tu = (cv::Mat_<double>(3,1) << vtux, vtuy, vtuz); // theta u
  cv::Rodrigues(tu, dR);
  double theta = sqrt(tu.dot(tu));
  double sinc = (fabs(theta) < 1.0e-8) ? 1.0 : sin(theta) / theta;
  double mcosc = (fabs(theta) < 2.5e-4) ? 0.5 : (1.-cos(theta)) / theta / theta;
  double msinc = (fabs(theta) < 2.5e-4) ? (1./6.) : (1.-sin(theta)/theta) / theta / theta;
  dt.at<double>(0,0) = vx*(sinc + vtux*vtux*msinc)
                     + vy*(vtux*vtuy*msinc - vtuz*mcosc)
                     + vz*(vtux*vtuz*msinc + vtuy*mcosc);
  dt.at<double>(1,0) = vx*(vtux*vtuy*msinc + vtuz*mcosc)
                     + vy*(sinc + vtuy*vtuy*msinc)
                     + vz*(vtuy*vtuz*msinc - vtux*mcosc);
  dt.at<double>(2,0) = vx*(vtux*vtuz*msinc - vtuy*mcosc)
                     + vy*(vtuy*vtuz*msinc + vtux*mcosc)
                     + vz*(sinc + vtuz*vtuz*msinc);
}


void optimizePose3(const PoseEstimation &pose, 
                   std::vector<FeatureMatch> &feature_matches,
                   PoseEstimation &optimized_pose) {

  //Set camera parameters
  double fx = camera_matrix.at<double>(0, 0); //Focal length
  double fy = camera_matrix.at<double>(1, 1);
  double cx = camera_matrix.at<double>(0, 2); //Principal point
  double cy = camera_matrix.at<double>(1, 2);


  auto inlier_matches = getInliers(pose, feature_matches);

  std::vector<cv::Point3d> wX;
  std::vector<cv::Point2d> x;

  const unsigned int npoints = inlier_matches.size();
  cv::Mat J(2*npoints, 6, CV_64F);
  double lambda = 0.25;
  cv::Mat xq(npoints*2, 1, CV_64F);
  cv::Mat xn(npoints*2, 1, CV_64F);
  double residual=0, residual_prev;
  cv::Mat Jp;


  for(auto i = 0u; i < npoints; i++) {

    //Model points
    const cv::Point2d &M = inlier_matches[i].model_point();

    wX.emplace_back(M.x, M.y, 0.0);


    //Imaged points
    const cv::Point2d &I = inlier_matches[i].image_point();

    xn.at<double>(i*2,0)   = I.x; // x
    xn.at<double>(i*2+1,0) = I.y; // y

    x.push_back(I);
  }


  //Initial estimation
  cv::Mat cRw = pose.rotation_matrix;
  cv::Mat ctw = pose.translation_vector;

  int nIters = 0;

  // Iterative Gauss-Newton minimization loop
  do {
    for (auto i = 0u; i < npoints; i++) {

      cv::Mat cX = cRw * cv::Mat(wX[i]) + ctw;            // Update cX, cY, cZ


      // Update x(q)
      //xq.at<double>(i*2,0)   = cX.at<double>(0,0) / cX.at<double>(2,0);    // x(q) = cX/cZ
      //xq.at<double>(i*2+1,0) = cX.at<double>(1,0) / cX.at<double>(2,0);    // y(q) = cY/cZ
      xq.at<double>(i*2,0)   = cx + fx * cX.at<double>(0,0) / cX.at<double>(2,0);
      xq.at<double>(i*2+1,0) = cy + fy * cX.at<double>(1,0) / cX.at<double>(2,0);

      // Update J using equation (11)
      J.at<double>(i*2,0) = -1 / cX.at<double>(2,0);        // -1/cZ
      J.at<double>(i*2,1) = 0; 
      J.at<double>(i*2,2) = x[i].x / cX.at<double>(2,0);    // x/cZ
      J.at<double>(i*2,3) = x[i].x * x[i].y;                // xy
      J.at<double>(i*2,4) = -(1 + x[i].x * x[i].x);         // -(1+x^2)
      J.at<double>(i*2,5) = x[i].y;                         // y

      J.at<double>(i*2+1,0) = 0;
      J.at<double>(i*2+1,1) = -1 / cX.at<double>(2,0);      // -1/cZ
      J.at<double>(i*2+1,2) = x[i].y / cX.at<double>(2,0);  // y/cZ
      J.at<double>(i*2+1,3) = 1 + x[i].y * x[i].y;          // 1+y^2
      J.at<double>(i*2+1,4) = -x[i].x * x[i].y;             // -xy
      J.at<double>(i*2+1,5) = -x[i].x;                      // -x
    }



    cv::Mat e_q = xq - xn;                     // Equation (7)


    cv::Mat Jp = J.inv(cv::DECOMP_SVD);        // Compute pseudo inverse of the Jacobian

    cv::Mat dq = -lambda * Jp * e_q;           // Equation (10)

    cv::Mat dctw(3, 1, CV_64F), dcRw(3, 3, CV_64F);
    exponential_map(dq, dctw, dcRw);

    cRw = dcRw.t() * cRw;                      // Update the pose
    ctw = dcRw.t() * (ctw - dctw);

    residual_prev = residual;                  // Memorize previous residual
    residual = e_q.dot(e_q);                   // Compute the actual residual

    std::cout << "residual_prev: " << residual_prev << std::endl;
    std::cout << "residual: " << residual << std::endl << std::endl;

    nIters++;

  } while (fabs(residual - residual_prev) > 0);
  //} while (nIters < 30);

  optimized_pose.rotation_matrix = cRw;
  optimized_pose.translation_vector = ctw;
  cv::Rodrigues(optimized_pose.rotation_matrix, optimized_pose.rotation_vector); 
}

Even when I use the functions as given, it does not produce the correct results. My initial pose estimate is very close to optimal, but when I try run the program, the method takes a very long time to converge - and when it does, the results are very wrong. I’m not sure what could be wrong and I’m out of ideas. I’m confident my inliers are actually inliers (they were chosen using an M-estimator). I’ve compared the results from exponential map with those from other implementations, and they seem to agree.

So, where is the error in this gauss-newton implementation for pose optimization? I’ve tried to make things as easy as possible for anyone willing to lend a hand. Let me know if there is anymore information I can provide. Any help would be greatly appreciated. Thanks.