Ask Your Question
8

LSH matching very slow

asked 2012-07-30 11:58:51 -0600

janos gravatar image

updated 2012-08-29 04:52:13 -0600

Hey guys and girls,

I'm testing a recognition pipeline using ORB descriptors. I recently changed the matcher from bruteforce to flann, using the new LSH index. Unfortunately the LSH matching is about 5-6 times slower than bruteforce using about 1000 descriptors as a database. Has anybody experienced similar problems? Is the LSH code still under development?

Current git commit hash: 72a4f19.

Part of the code:

  boost::shared_ptr<cv::FlannBasedMatcher> m_matcherPtr;
  cv::Ptr<cv::flann::IndexParams> indexParams = new cv::flann::LshIndexParams(12, 20, 2);
  m_matcherPtr = boost::shared_ptr<cv::FlannBasedMatcher>(new cv::FlannBasedMatcher(indexParams));
  m_matcherPtr->add(m_descriptorsVec);
  m_matcherPtr->train();

  // This step takes very long
  m_matcherPtr->knnMatch(*query, knnmatches, 1);
edit retag flag offensive close merge delete

3 answers

Sort by ยป oldest newest most voted
7

answered 2012-08-21 02:49:00 -0600

yes123 gravatar image

updated 2012-08-27 18:35:12 -0600

1000 descriptors are too few.
I have tried both BF and LSH with about 1'000'000 descriptors and time were something like:

BruteForce: 50sec (knnMatch)
LSH: 2sec         (knnMatch)

I have tuned a bit the LSH param to get that speed, anyway with default param and 1mil descriptors the knnMatch took 4 sec

edit flag offensive delete link more

Comments

Are you saying that LSH with 1000 descriptors is slower than with 1.000.000? How long does your code take for LSH with 1000 descriptors?

Rui Marques gravatar imageRui Marques ( 2012-08-21 05:54:26 -0600 )edit

@rui: i haven't tried it with 1000 descriptors. it's not that usefull to use LSH with so few descriptors

yes123 gravatar imageyes123 ( 2012-08-21 09:01:57 -0600 )edit

Maybe using LSH is not that much better than BF in this case, but i think it should be at least near the performance of the BF matching for small datasets. With the 1000 descriptors i used, LSH needs about 30-40 seconds. Using 1000 times the number of descriptors wont speed up the matching magically. So i guess there has to be some problem with my code or OpenCV. Does your code look similar to the one i use? Could you post it?

janos gravatar imagejanos ( 2012-08-27 10:01:59 -0600 )edit
2

Hi, I implemented the LSH in FLANN/OpenCV and I would concur with yes123: LSH will not give you a speedup for 1000 descriptors and probably for 10000 too. You will first need to change the parameters a lot but even then, there will be so many cache misses (as you have to go in a bunch of random buckets around) that LSH will be slower in that case.

Think about the extreme case of 1 element. With LSH, you need to compute several hash values, get the different bucks located somewhere in RAM, go over them to find the nearest neighbors and compute their Hamming distance. With your scale, most of those would be empty but the algorithm is still way more complex than a few clock ticks.

Vincent Rabaud gravatar imageVincent Rabaud ( 2012-08-27 14:35:30 -0600 )edit
3

answered 2012-08-29 05:48:58 -0600

janos gravatar image

I fixed it. I played a bit with the parameters of LSH and changed them to LshIndexParams(8, 30, 2). Now i'm down to 0.4 seconds for 2000 descriptors, which is fast enough for my problem.

Thanks to everyone!

edit flag offensive delete link more

Comments

In my case, (30, 8, 2) works fine

Waters gravatar imageWaters ( 2013-04-07 17:07:52 -0600 )edit
1

answered 2019-01-09 04:40:54 -0600

pierre_w gravatar image

updated 2019-01-09 04:42:35 -0600

I had the same issue with LSH descriptor matching. I have played around quite a bit with the parameters. I found that the main factor slowing down the process is the use of the multi-probe (third parameter). The multi-probe LSH is a special case of LSH which is meant to improve matching accuracy at the cost of a much higher matching time. The simple LSH (without multi-probe) is much faster (about 10x) and for a low number of descriptors (<10000) the drop in accuracy is not that significant. For 1000x1000 descriptor matching, I get a processing time of about 2 ms on a powerful laptop (about 4.5 ms for brute-force matching). To disable multi-probe, just set the multi-probe parameter (thrid parameter of the LSHIndexParams constructor) to 1.

Lastly, when you call the macthing methods, they always call the train method first. A simple workaround is to derive the FlannBasedMatcher class to make the xxxMatchImpl methods public. But then, it is up to you to make sure that the LSH search index has been trained before you invoke a matching method.

edit flag offensive delete link more

Question Tools

4 followers

Stats

Asked: 2012-07-30 11:58:51 -0600

Seen: 10,435 times

Last updated: Jan 09 '19