
Align 2 photos of text

asked 2015-01-13 10:56:03 -0600

axk

updated 2015-01-13 13:18:48 -0600

I've got 2 images of text where the right side of the first image overlaps with the left part of the second (2 partial photos of the same page of text taken from left to right). I'd like to stitch the images and I'm trying an approach with feature matching. I've tried the example with the ORB feature search + brute force feature matching from the site http://docs.opencv.org/trunk/doc/py_t...

It's completely off in my case, at least with the default parameters of the feature search. It seems logical that it would have a hard time with text if it relies on corners.

How do I match these kinds of images of text more reliably with feature matching? Should I specify some non-default parameters for the ORB algorithm, or use a different algorithm with different parameters?

Mat p1 = new Mat("part1.jpg", LoadMode.GrayScale);
Mat p2 = new Mat("part2.jpg", LoadMode.GrayScale);

var orb = new ORB();

Mat ds1;
var kp1 = DetectAndCompute(orb, p1, out ds1);

Mat ds2;
var kp2 = DetectAndCompute(orb, p2, out ds2);

var bfMatcher = new BFMatcher(NormType.Hamming, crossCheck: true);
// Match ds1 against ds2, not ds1 against itself.
var matches = bfMatcher.Match(ds1, ds2);

var tenBestMatches = matches.OrderBy(x => x.Distance).Take(10);

var res = new Mat();
Cv2.DrawMatches(p1, kp1, p2, kp2, tenBestMatches, res, flags: DrawMatchesFlags.DrawRichKeypoints);


using (new Window("result", WindowMode.ExpandedGui, res))
{
    Cv2.WaitKey();
}

private static KeyPoint[] DetectAndCompute(ORB orb, Mat img, out Mat descriptors)
{
    // Detect keypoints, then compute ORB descriptors for them.
    var keypoints = orb.Detect(img);
    descriptors = new Mat();
    orb.Compute(img, ref keypoints, descriptors);
    return keypoints;
}



Comments

Why don't you use the built-in stitching pipeline from OpenCV instead of rebuilding the whole system manually?

StevenPuttemans ( 2015-01-14 02:16:44 -0600 )

1 answer


answered 2015-01-13 15:50:55 -0600

updated 2015-01-13 16:21:15 -0600

I see a few possible solutions for your particular task:

  • it seems that simple and fast phase correlation should be enough, see the phaseCorrelate() function docs; maybe some size normalization should be performed (based on line spacing, letter size, etc.), or brute-force it by building a size pyramid with small steps like (1.5)^(1/8) ≈ 1.05
  • text is self-similar at the letter scale and not self-similar at the word scale, hence you can try to increase the zone for the key-point descriptor if you want to stay with the salient-points approach.

Comments

"text is self-similar on the letter size scale and not self-similar on the word size scale, hence you can try to increase zone for key-point descriptor if you want to stay in salient points approach."

Wouldn't that only be possible if the vertical distance were much greater? If you just increase your zone here, you'd also include characters from the adjacent rows in your descriptor.

FooBar ( 2015-01-14 00:50:34 -0600 )

@FooBar These are photos of the same page, so it does not matter how many rows end up in your descriptors -- they will mostly match each other; the bigger the descriptor area, the smaller the false-positive match rate.

Vit ( 2015-01-14 04:18:22 -0600 )

From what I understand about the phase correlation algorithm from Wikipedia, it works on 2 very similar images, whereas in my case only the overlapping (left and right) parts of the images are similar. Would this still work?

axk ( 2015-01-14 12:21:21 -0600 )

I've tried phase correlation, taking a small rectangular portion of the first image and matching it against the second image, scanning across the second image and looking for the position with the smallest displacement vector returned from the phaseCorrelate function. From what I see, it is prone to false positives in this scenario, identifying completely unrelated regions as matching with a displacement of 0.1 pixels.

axk ( 2015-01-14 14:17:01 -0600 )

@axk Your images' scales differ, so "some size normalization should be performed (based on line spacing, letter size, etc.) or brute force like building a size pyramid using small steps like (1.5)^(1/8) ≈ 1.05". Phase correlation works well for dx+dy shifts only, not for scale changes.

Vit ( 2015-01-15 00:40:49 -0600 )


Stats

Asked: 2015-01-13 10:56:03 -0600

Seen: 917 times

Last updated: Jan 13 '15