img = img1*mask + img2*(1-mask): how to do that?

merge
alpha

asked 2017-06-16 11:36:04 -0600

carton99
56 ●2 ●4 ●7

updated 2017-06-16 11:38:23 -0600

Hello,

I would like to merge two color images using an image mask.

img1 and img2 are color images with 3 channels; mask is a grey image with 1 channel.

To merge the two images with the mask, I loop over each pixel.

float c1,c2;
for(int j = 0; j < img1.rows ; j++ ){
    for(int i = 0; i < img1.cols ; i++ ){
        c1 = (greyGoodScale.at<uchar>(j, i))/255.0;
        c2 = 1-c1;
        img.at<Vec3b>(j, i)[0] = c2*img1.at<Vec3b>(j, i)[0] + c1*img2.at<Vec3b>(j, i)[0];
        img.at<Vec3b>(j, i)[1] = c2*img1.at<Vec3b>(j, i)[1] + c1*img2.at<Vec3b>(j, i)[1];
        img.at<Vec3b>(j, i)[2] = c2*img1.at<Vec3b>(j, i)[2] + c1*img2.at<Vec3b>(j, i)[2];
    }
}

OK, it works, but my image is 720x500 and the processing takes 70 ms, which is too long: I need it to run in real time. I can't do the processing on the GPU.

Is there a way to reduce the processing time?

Thanks. Christophe (OpenCV 3.x)


Comments

Thank you for your message, but your example works only with a binary mask.

In my case, I work with a mask of coefficient values.

carton99 (2017-06-19 04:18:44 -0600)

You can modify your loop body to make it faster:

const uchar *scale = &greyGoodScale.at<uchar>(j, 0);
uchar *imgdata = &img.at<uchar>(j, 0);
const uchar *imgdata1 = &img1.at<uchar>(j, 0);
const uchar *imgdata2 = &img2.at<uchar>(j, 0);
for(int i = 0; i < img1.cols ; i++ ){
    c1 = scale[i]/255.0;
    c2 = 1-c1;
    int pos = 3*i;
    for (int k = 0; k < 3; k++){
        imgdata[pos + k] = c2*imgdata1[pos + k] + c1*imgdata2[pos + k];
    }
}

To make it even faster, you can build a conversion table from uchar to double for c1: just precompute the values from 0/255.0 to 255.0/255.0. You can precompute 1 - c1 too.

678098 (2017-06-19 05:51:05 -0600)
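The conversion-table idea from the comment above could look roughly like the sketch below. It is only an illustration: it uses plain std::vector buffers in place of cv::Mat so that it compiles without OpenCV, it stores the weights as float rather than double, and the names makeWeightTable and blendWithTable are made up for this example. The weighting follows the loop in the question (mask weight on the second image).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: table[m] = m / 255.0f for every possible mask value,
// so the division is done 256 times in total instead of once per pixel.
std::array<float, 256> makeWeightTable() {
    std::array<float, 256> table{};
    for (int m = 0; m < 256; ++m) {
        table[m] = static_cast<float>(m) / 255.0f;
    }
    return table;
}

// Blend two interleaved 3-channel buffers with a 1-channel mask:
// out = (1 - w) * a + w * b, with w = mask / 255.
// Assumption: all buffers are contiguous and describe the same image size.
std::vector<std::uint8_t> blendWithTable(const std::vector<std::uint8_t>& a,
                                         const std::vector<std::uint8_t>& b,
                                         const std::vector<std::uint8_t>& mask) {
    assert(a.size() == b.size() && a.size() == 3 * mask.size());
    static const std::array<float, 256> table = makeWeightTable();
    std::vector<std::uint8_t> out(a.size());
    for (std::size_t i = 0; i < mask.size(); ++i) {
        const float w = table[mask[i]];   // table lookup replaces the division
        const float inv = 1.0f - w;
        for (std::size_t k = 0; k < 3; ++k) {
            const std::size_t p = 3 * i + k;
            // The cast truncates toward zero, like the implicit conversion in the question.
            out[p] = static_cast<std::uint8_t>(inv * a[p] + w * b[p]);
        }
    }
    return out;
}
```

With cv::Mat, the same inner loop could run on row pointers as in the comment above. Whether the table actually beats the per-pixel division depends on the compiler and CPU, so it is worth measuring.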

Why is your code slow? There are two reasons: 1) You are using .at, which is slower than pointer access. 2) You are using Vec3b for pixel access. Every time you write .at<Vec3b>, a new object of class Vec3b is created.

template<typename _Tp, int n> class Vec : public Matx<_Tp, n, 1> {...};
typedef Vec<uchar, 3> Vec3b;

678098 (2017-06-19 05:59:56 -0600)


2 answers


answered 2017-06-19 08:36:00 -0600

carton99
56 ●2 ●4 ●7

OK, thank you for your responses. I have tested all the solutions.

const uchar *scale = &greyGoodScale.at<uchar>(j, 0);
uchar *imgdata = &img.at<uchar>(j, 0);
const uchar *imgdata1 = &img1.at<uchar>(j, 0);
const uchar *imgdata2 = &img2.at<uchar>(j, 0);
for(int i = 0; i < img1.cols ; i++ ){
    c1 = scale[i]/255.0;
    c2 = 1-c1;
    int pos = 3*i;
    for (int k = 0; k < 3; k++){
        imgdata[pos + k] = c2*imgdata1[pos + k] + c1*imgdata2[pos + k];
    }
}

is one of the best solutions, with 16 ms of processing time.

Now I would like to divide the loop across 4 threads. My code is OK:

std::thread t1(firstQuarter...
..
std::thread t4(fourthQuarter...
t1.join();
...
t4.join();

but the time is exactly the same as with one thread, or even higher. It's just classic array memory access, and the output image doesn't need a mutex because each thread handles a quarter of the image. How can I make the matrix arrays accessible to several threads in this simple case?
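For reference, here is a minimal sketch of splitting the rows into bands, one std::thread per band. It is not the code from this post (which is elided above): it assumes contiguous buffers (plain std::vector here instead of cv::Mat), and blendRows and blendThreaded are hypothetical names. Each thread writes only its own rows, so no mutex is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Blend rows [rowBegin, rowEnd) of two interleaved 3-channel images:
// out = (1 - w) * a + w * b, with w = mask / 255.
void blendRows(const std::uint8_t* a, const std::uint8_t* b, const std::uint8_t* mask,
               std::uint8_t* out, int cols, int rowBegin, int rowEnd) {
    for (int j = rowBegin; j < rowEnd; ++j) {
        for (int i = 0; i < cols; ++i) {
            const std::size_t px = static_cast<std::size_t>(j) * cols + i;
            const float w = mask[px] / 255.0f;
            for (int k = 0; k < 3; ++k) {
                const std::size_t p = 3 * px + k;
                out[p] = static_cast<std::uint8_t>((1.0f - w) * a[p] + w * b[p]);
            }
        }
    }
}

// Hypothetical driver: one thread per band of rows, bands do not overlap.
std::vector<std::uint8_t> blendThreaded(const std::vector<std::uint8_t>& a,
                                        const std::vector<std::uint8_t>& b,
                                        const std::vector<std::uint8_t>& mask,
                                        int rows, int cols, int numThreads) {
    assert(numThreads > 0);
    assert(a.size() == b.size() && a.size() == 3 * mask.size());
    assert(mask.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<std::uint8_t> out(a.size());
    std::vector<std::thread> workers;
    const int band = (rows + numThreads - 1) / numThreads;  // rows per thread, rounded up
    for (int t = 0; t < numThreads; ++t) {
        const int begin = std::min(rows, t * band);
        const int end = std::min(rows, begin + band);
        if (begin == end) continue;  // more threads than rows: nothing left to do
        workers.emplace_back(blendRows, a.data(), b.data(), mask.data(),
                             out.data(), cols, begin, end);
    }
    for (std::thread& w : workers) {
        w.join();  // join() is a function call; the parentheses are required
    }
    return out;
}
```

This version creates the threads on every call. For a video sequence, keeping the worker threads alive between frames may reduce that overhead.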


Comments

How are you measuring time? Have you considered the granularity of the time measurement? When the completion time is small, it is better to repeat the call in a loop for precision:

int64 startingTick = cv::getTickCount();
const int repeatsNum = 1000;
for (int i = 0; i < repeatsNum; i++){
    yourMethodToMeasure();
}
double timeDiff = (cv::getTickCount() - startingTick)/cv::getTickFrequency()/repeatsNum;

678098 (2017-06-19 08:57:03 -0600)

Hi, I use this at the beginning and the end of the complete loop:

clock_t t = clock();
// ... the complete loop being measured runs here ...
double tmilisec = int(1000*double(clock() - t) / CLOCKS_PER_SEC);

I work on a video sequence; the 16 ms is an average over all images in my sequence.

I obtain the same result with your method.

carton99 (2017-06-19 09:08:11 -0600)
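A side note on the clock() measurement above: clock() reports processor time, and on POSIX systems that is the CPU time of the whole process summed over its threads, so four threads working in parallel can show the same or a higher figure than one thread. Whether this explains the numbers in this thread is not confirmed. A wall-clock measurement with std::chrono::steady_clock avoids the issue; the helper name averageWallMs below is made up for this sketch.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: average wall-clock time in milliseconds per call of fn,
// measured over `repeats` consecutive calls.
template <typename Fn>
double averageWallMs(Fn&& fn, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        fn();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

It can be called with a lambda that wraps the function to measure, for example averageWallMs([&] { yourMethodToMeasure(); }, 1000), where yourMethodToMeasure is the placeholder from the earlier comment.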

Could thread creation be the problem? Are you creating 4 new threads for each image? https://stackoverflow.com/questions/3929774/how-much-overhead-is-there-when-creating-a-thread

678098 (2017-06-19 09:33:10 -0600)

OK, I was wrong. Multithreading the function works. Thanks for everything.

carton99 (2017-06-23 10:59:21 -0600)


answered 2017-06-16 11:52:49 -0600

LBerger
9317 ●2 ●20 ●88 http://www.traimaocv.fr

updated 2017-06-19 10:00:27 -0600

Try something like this:

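ocl::setUseOpenCL(false);
Mat img1 = imread("f:/lib/opencv/samples/data/lena.jpg", IMREAD_COLOR);
Mat img2 = imread("f:/lib/opencv/samples/data/orange.jpg", IMREAD_COLOR);
// mask1 is assumed to be a single-channel CV_8U mask of the same size as img1
// (its creation is not shown in the original post)
Mat mask2c, mask1c;
Mat img3, img4;
TickMeter t1;   // timer declaration and start were not shown in the original post
t1.start();
vector<Mat> pMask = { mask1, mask1, mask1 };
merge(pMask, mask1c);                    // replicate the mask to 3 channels
mask2c = Vec3b(255,255,255) - mask1c;    // 255 - mask
multiply(img1, mask1c, img3, 1, CV_32S); // img1 * mask, in 32-bit integers
multiply(img2, mask2c, img4, 1, CV_32S); // img2 * (255 - mask)
img4 = img3 + img4;
img4.convertTo(img3, CV_8U, 1./255);     // scale back to 8 bits
t1.stop();
cout << t1.getTimeMilli() << "\n";
imshow("test2", img3);
waitKey();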

@berak I lost the race because multiply doesn't use ParallelLoopBody. Pointers are better here. OpenCL is disabled too.
