OpenCV: two algorithms running in parallel
Thanks for reading this post. My program is written in C++ and runs in Visual Studio 2017, with OpenCV 4.0.1 built with TBB and MKL. I am trying to run two instances of a similar combination of OpenCV functions: resize, threshold, morphological close and open, and finally findContours. In my application I capture two frames from two cameras and try to process them in parallel.
Processing one frame on its own takes about 9 ms, and processing two frames sequentially takes 17 ms. When I run the two in parallel using std::thread, the processing time does not improve; it actually gains about 1 ms of thread-creation overhead, so it takes 18 ms. I have tried the Boost library, but the results were similar to std::thread. With TBB task groups there is no task-creation overhead, but the processing time still stays at 17 ms.
I have provided the code below. I am wondering whether I am doing something wrong or whether this behavior is normal for this kind of processing. I expected the processing time to drop to roughly 9-12 ms when running the code in parallel, but that is not what happens.
Using std::thread:
void find_c(const Mat &im, Mat &im_c)
{
    Mat imm0;
    resize(im, imm0, Size(), 0.3, 0.3, INTER_NEAREST); // downscale to 30% on both x and y
    Mat d1 = Mat::zeros(Size(imm0.cols, imm0.rows), CV_8UC1);
    int pic_width = imm0.cols; // width of the resized image for further calculation
    d1(Range(470, 470 + 100), Range(0, d1.cols)) = 255; // mask rows 470..569; assumes the resized image has at least 570 rows
    im_c = imm0.clone();
    Mat imm1;
    bitwise_and(imm0, d1, imm1); // keep only the masked band
    Mat imm2;
    threshold(imm1, imm2, 100, 255, THRESH_BINARY);
    Mat imm3;
    morphologyEx(imm2, imm3, MORPH_CLOSE, Mat(), Point(-1, -1), 2);
    Mat imm4;
    morphologyEx(imm3, imm4, MORPH_OPEN, Mat(), Point(-1, -1), 2);
    vector<vector<Point> > contours;
    vector<Vec4i> hierarchy;
    findContours(imm4, contours, hierarchy, RETR_LIST, CHAIN_APPROX_SIMPLE, Point(0, 0));
    // for the circle rectangle and other info
    vector<vector<Point> > contours_poly(contours.size());
    vector<Rect> boundRect(contours.size());
    vector<Point2f> center(contours.size());
    vector<float> radius(contours.size());
    for (size_t i = 0; i < contours.size(); i++)
    {
        approxPolyDP(Mat(contours[i]), contours_poly[i], 3, true);
        //boundRect[i] = boundingRect(Mat(contours_poly[i]));
        minEnclosingCircle((Mat)contours_poly[i], center[i], radius[i]);
    }
    /// Draw contours
    //Mat drawing = Mat::zeros(gray.size(), CV_8UC3);
    for (size_t i = 0; i < contours.size(); i++)
    {
        // scale the radius first, then convert to int (the original cast applied to radius[i] only)
        circle(im_c, center[i], static_cast<int>(radius[i] * 1.5), Scalar(255, 255, 255), 2, 8, 0);
        cout << "center of the contour No." << i + 1 << "=" << center[i] << endl;
    }
}
void find_cx(const Mat &immc, Mat &im_x)
{
    Mat imp0;
    resize(immc, imp0, Size(), 0.3, 0.3, INTER_NEAREST);
    Mat d2 = Mat::zeros(Size(imp0.cols, imp0.rows), CV_8UC1);
    int pic_width = imp0.cols; // width of the resized image for further calculation
    d2(Range(270, 270 + 200), Range(0, d2.cols)) = 255;
    im_x = imp0.clone();
    Mat imp1;
    bitwise_and(imp0, d2, imp1);
    Mat imp2;
    threshold(imp1, imp2, 100, 255, THRESH_BINARY);
    Mat imp3;
    morphologyEx(imp2, imp3, MORPH_CLOSE ...
OpenCV code is already highly parallelized internally, and creating threads is not free either.
Posting a minimal code example would make it easier to read and understand what you are trying to achieve.
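As a starting point for such a minimal example, the sequential versus two-thread comparison described in the question could be reduced to a sketch like the one below. It uses only the C++ standard library. The names `busy_work`, `time_ms`, `Timings` and `compare` are hypothetical and not part of the original program, and `busy_work` is only a CPU-bound stand-in for `find_c` / `find_cx`, so its timings say nothing about OpenCV itself.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for find_c / find_cx: a CPU-bound loop
// with no shared state and no console output.
std::uint64_t busy_work(std::uint64_t n)
{
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < n; ++i)
        acc += (i * 2654435761u) ^ (acc >> 3);
    return acc;
}

// Wall-clock milliseconds taken by a callable, using a monotonic clock.
template <typename F>
double time_ms(F &&f)
{
    const auto t0 = std::chrono::steady_clock::now();
    f();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

struct Timings
{
    double sequential_ms;
    double parallel_ms;
    std::uint64_t seq_a, seq_b; // results of the sequential run
    std::uint64_t par_a, par_b; // results of the two-thread run
};

// Runs the same two jobs back to back, then again with one extra thread.
Timings compare(std::uint64_t n)
{
    Timings t{};
    t.sequential_ms = time_ms([&] {
        t.seq_a = busy_work(n);
        t.seq_b = busy_work(n + 1);
    });
    t.parallel_ms = time_ms([&] {
        std::thread worker([&] { t.par_a = busy_work(n); });
        t.par_b = busy_work(n + 1); // second job stays on the calling thread
        worker.join();
    });
    // Both runs must compute the same values; only the timing may differ.
    assert(t.seq_a == t.par_a && t.seq_b == t.par_b);
    return t;
}
```

Calling `compare(50000000)` and printing `sequential_ms` and `parallel_ms` gives a baseline for how much two independent single-threaded jobs gain from a second thread on a given machine. If the real functions are swapped in and the gap disappears, that would be consistent with the comment above: work that already uses several cores internally leaves little for a second thread to gain.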