Thread-creation with parallel_for

asked 2015-08-10 12:37:53 -0600

updated 2020-11-28 06:07:12 -0600

Hey!

I'm trying to speed up an application and want to see if parallel_for_ can help me. As a first step, I wrote a small program that finds the maximum value in each column of an image and marks its position. It looks like this:

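class Parallel_markMax : public cv::ParallelLoopBody
{
private:
    cv::Mat &img_rgb_;  // colour image the markers are drawn into
    cv::Mat &img_;      // 8-bit single-channel input image

public:
    Parallel_markMax(cv::Mat &img, cv::Mat &img_rgb):
        img_rgb_(img_rgb),
        img_(img)
    {}

    // Called by parallel_for_ with a sub-range of columns [range.start, range.end)
    virtual void operator()(const cv::Range &range) const {

        int h = img_.rows;
        // Debug output: shows how large each assigned sub-range is
        std::cout << "hell " << range.start << "  " << range.size() << std::endl;

        for (int x = range.start; x < range.end; ++x) {

            // Find the row with the largest value in column x
            uchar max_val = 0;
            int pos = 0;
            for (int y = 0; y < h; ++y) {
                if (img_.at<uchar>(y, x) > max_val) {
                    max_val = img_.at<uchar>(y, x);
                    pos = y;
                }
            }

            // Mark the maximum of this column
            cv::circle(img_rgb_, cv::Point(x, pos), 3, cv::Scalar(255, 0, 0), 1);
        }
    }
};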

And I call it like this:

parallel_for_(Range(0,w), Parallel_markMax(img, img_col));

I have 8 threads (result of getNumThreads), so I expected eight threads, each handling 1/8 of the range. Instead I get a huge number of calls to operator(), each with a range size of only 1 to 10. So instead of giving each thread one bigger task, the thread manager hands out very small tasks, which probably causes a lot of overhead. In my example I only get a speedup of 3 on 8 cores, which is rather poor for a perfectly parallelizable task.

Is there a parameter I'm missing?


1 answer


answered 2015-08-10 13:16:06 -0600

updated 2015-08-10 13:42:25 -0600

Alright, I figured it out with the help of a colleague:

parallel_for_(Range(0,w), Parallel_markMax(img, img_col),12);

You can pass an additional parameter (nstripes) that controls how many stripes the range is split into, and therefore the size of the individual tasks. In this case, I create two tasks for each of six of my eight cores, 12 in total. Two cores are normally busy with other work, so this way I don't have to wait for the last two tasks to finish on those cores when the other six are much faster.
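
To make the effect of that third argument more concrete, here is a minimal sketch of how a range can be cut into a fixed number of near-equal stripes. It only illustrates the arithmetic: split_range is a hypothetical helper written for this example, not OpenCV's actual scheduling code, and the threading backend in use may hand out the stripes differently.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper (not part of OpenCV): splits [start, end) into at most
// `nstripes` contiguous stripes whose sizes differ by at most one element.
std::vector<std::pair<int, int>> split_range(int start, int end, int nstripes)
{
    std::vector<std::pair<int, int>> stripes;
    const int len = end - start;
    if (len <= 0 || nstripes <= 0)
        return stripes;  // nothing to split

    // Never create more stripes than there are elements.
    const int n = std::min(nstripes, len);
    for (int i = 0; i < n; ++i) {
        // Integer division places the boundaries; 64-bit math avoids overflow.
        const int lo = start + static_cast<int>(static_cast<long long>(len) * i / n);
        const int hi = start + static_cast<int>(static_cast<long long>(len) * (i + 1) / n);
        stripes.emplace_back(lo, hi);  // half-open stripe [lo, hi)
    }
    return stripes;
}
```

As an example, assuming an image width of 640 (my number, not from the question), split_range(0, 640, 12) gives 12 stripes of 53 or 54 columns each, so every call to operator() would get a range of roughly that size instead of 1 to 10.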

