
OK, thank you for your response. I have tested all the solutions.

const uchar *scale = &greyGoodScale.at<uchar>(j, 0);
uchar *imgdata = &img.at<uchar>(j, 0);
const uchar *imgdata1 = &img1.at<uchar>(j, 0);
const uchar *imgdata2 = &img2.at<uchar>(j, 0);
for(int i = 0; i < img1.cols ; i++ ){
    c1 = scale[i]/255.0;
    c2 = 1-c1;
    int pos = 3*i;
    for (int k = 0; k < 3; k++){
        imgdata[pos + k] = c2*imgdata1[pos + k] + c1*imgdata2[pos + k];
    }
}

This is one of the best solutions, with a processing time of 16 ms.

Now I would like to split the loop across 4 threads. My code compiles and runs:

std::thread t1(firstQuarter...
..
std::thread t4(fourthQuarter...
t1.join();
...
t4.join();

but the time is exactly the same as with one thread, and sometimes even higher. This is just plain array memory access, and the output image does not need a mutex because each thread works on its own quarter of the image. How can I make the matrix arrays usable from several threads in this simple case?
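For reference, here is a minimal sketch of the four-way split described above, written against raw buffers so it does not depend on OpenCV. The names blendRows and blendParallel are hypothetical (they are not the firstQuarter...fourthQuarter functions from my code), and the buffers are assumed to be tightly packed with no row padding. Each thread writes to a disjoint range of rows, so no mutex is involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Blend rows [rowBegin, rowEnd) of two 3-channel images using a 1-channel mask.
// Assumption: all buffers are tightly packed (no padding at the end of a row).
void blendRows(const std::uint8_t* mask, const std::uint8_t* src1,
               const std::uint8_t* src2, std::uint8_t* dst,
               int cols, int rowBegin, int rowEnd)
{
    for (int j = rowBegin; j < rowEnd; ++j) {
        const std::size_t maskOff = static_cast<std::size_t>(j) * cols;
        const std::size_t rowOff = maskOff * 3;
        for (int i = 0; i < cols; ++i) {
            const float c1 = mask[maskOff + i] / 255.0f;  // weight of src2
            const float c2 = 1.0f - c1;                   // weight of src1
            const std::size_t pos = rowOff + 3 * static_cast<std::size_t>(i);
            for (int k = 0; k < 3; ++k) {
                dst[pos + k] = static_cast<std::uint8_t>(
                    c2 * src1[pos + k] + c1 * src2[pos + k]);
            }
        }
    }
}

// Split the rows into contiguous chunks, one per thread, then wait for all.
void blendParallel(const std::vector<std::uint8_t>& mask,
                   const std::vector<std::uint8_t>& src1,
                   const std::vector<std::uint8_t>& src2,
                   std::vector<std::uint8_t>& dst,
                   int rows, int cols, int numThreads = 4)
{
    const std::size_t pixels = static_cast<std::size_t>(rows) * cols;
    assert(mask.size() == pixels);
    assert(src1.size() == pixels * 3 && src2.size() == pixels * 3);
    assert(dst.size() == pixels * 3);

    std::vector<std::thread> workers;
    const int chunk = (rows + numThreads - 1) / numThreads;  // rows per thread, rounded up
    for (int t = 0; t < numThreads; ++t) {
        const int begin = std::min(rows, t * chunk);
        const int end = std::min(rows, begin + chunk);
        if (begin == end) break;  // fewer rows than threads
        workers.emplace_back(blendRows, mask.data(), src1.data(), src2.data(),
                             dst.data(), cols, begin, end);
    }
    for (auto& w : workers) {
        w.join();  // note the parentheses: join must actually be called
    }
}
```

A call such as blendParallel(mask, img1, img2, out, rows, cols, 4) should give the same result as running with a single thread. Whether it is faster depends on things I have not pinned down yet: the build needs optimizations enabled (and -pthread on some toolchains), thread start-up has a cost that is noticeable next to a 16 ms loop, and a loop this simple may be limited by memory bandwidth rather than by the CPU. OpenCV also provides cv::parallel_for_, which might be an alternative to managing std::thread objects by hand.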