TBB parallel_for vs std::thread
Hi,
I'm starting with parallel processing in OpenCV and wonder why I should use parallel_for (from TBB) instead of just using multiple std::threads. As I understand the parallel_for functionality, you have to create a class extending cv::ParallelLoopBody with a method signature of void operator()(const cv::Range& range) const. This is where the processing happens. But you cannot pass any arguments to this function, nor can you parametrise your parallel function in any way. All you have is the range assigned to your thread and the arguments you passed to the cv::ParallelLoopBody instance, which are the same for every thread. So you have to sort out your arguments using that range, e.g. by passing a vector of images to the cv::ParallelLoopBody instance and then using the range to extract the one you need. You have to do this for every single parameter that is thread-dependent.
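To make the pattern concrete, here is a minimal sketch of "index your per-item parameters with the range". It deliberately does not use OpenCV: Range, LoopBody, ScaleBody and run_in_chunks are hypothetical stand-ins that only mimic the shape of cv::Range and cv::ParallelLoopBody, so the sketch builds with the standard library alone.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical stand-ins: they only mimic the shape of cv::Range and
// cv::ParallelLoopBody so that this sketch compiles without OpenCV.
struct Range { int start; int end; };

struct LoopBody {
    virtual ~LoopBody() = default;
    virtual void operator()(const Range& range) const = 0;
};

// Per-item parameters are stored in vectors held by the body. The loop index
// taken from the range selects the entry that belongs to the current item.
class ScaleBody : public LoopBody {
public:
    ScaleBody(const std::vector<int>& inputs,
              const std::vector<int>& factors,
              std::vector<int>& outputs)
        : inputs_(inputs), factors_(factors), outputs_(outputs) {}

    void operator()(const Range& range) const override {
        for (int i = range.start; i < range.end; ++i) {
            outputs_[i] = inputs_[i] * factors_[i];  // each index writes only its own slot
        }
    }

private:
    const std::vector<int>& inputs_;
    const std::vector<int>& factors_;
    std::vector<int>& outputs_;
};

// Hypothetical driver: splits [0, total) into contiguous chunks and runs one
// thread per chunk. Assumes num_threads > 0.
void run_in_chunks(const LoopBody& body, int total, int num_threads) {
    std::vector<std::thread> workers;
    const int chunk = (total + num_threads - 1) / num_threads;
    for (int begin = 0; begin < total; begin += chunk) {
        const Range r{begin, std::min(begin + chunk, total)};
        workers.emplace_back([&body, r] { body(r); });
    }
    for (auto& w : workers) w.join();
}
```

Calling run_in_chunks(body, 5, 2) on a ScaleBody built from three vectors of length 5 fills every output slot. The body itself is shared by all threads, and only the index differs.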
So what's the benefit compared to threads? I can bind any arbitrary function with (almost) arbitrary parameters using boost or C++11, without creating a new class for each task to be parallelized. For this purpose I wrote a very primitive thread pool manager (.hpp, .cpp). Anything wrong with that?
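As a rough illustration of what is meant by binding functions with per-task parameters, the sketch below starts three std::threads, each with its own arguments. The function names repeat_label and run_tasks are made up for this example, and it is not the thread pool manager mentioned above.

```cpp
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical per-task function: every call receives its own arguments.
void repeat_label(const std::string& label, int times, std::string& out) {
    for (int i = 0; i < times; ++i) out += label;
}

std::vector<std::string> run_tasks() {
    std::vector<std::string> results(3);  // one slot per thread, never resized
    std::vector<std::thread> workers;

    // std::thread copies its arguments; std::ref is needed to pass a reference.
    workers.emplace_back(repeat_label, "a", 1, std::ref(results[0]));

    // std::bind packages function and arguments first; the thread just calls it.
    workers.emplace_back(std::bind(repeat_label, "b", 2, std::ref(results[1])));

    // A lambda achieves the same and is often easier to read.
    workers.emplace_back([&results] { repeat_label("c", 3, results[2]); });

    for (auto& w : workers) w.join();
    return results;
}
```

Each thread writes to a different element of results, so no locking is needed here. Spawning one thread per task does no load balancing, though, which is the part a task scheduler adds.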
cheers, stfn
P.S. I'm not a threading expert. I know there are memory access concerns when the functions I'm threading write to the same memory. Reading is not the problem, but when two functions write simultaneously, e.g. to the same Mat, what happens apart from probably corrupted data due to race conditions? Is caching triggered, forcing the data to be up to date before writing? More generally: what do I need to take care of in terms of performance and data safety? Are those pitfalls already taken care of in TBB, and is this why it is used in OpenCV?
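For what it's worth, the C++11 memory model treats two unsynchronized writes to the same memory location as a data race with undefined behavior, so nothing automatic makes the data consistent. A minimal sketch of the two usual ways around this follows: give each thread a disjoint part of the buffer, and use an atomic (or a mutex) for the few values that really are shared. The plain byte vector stands in for image data, and invert_and_count is a hypothetical name.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Inverts the buffer in place and returns how many values were above the
// threshold before inversion. Assumes num_threads > 0.
long invert_and_count(std::vector<unsigned char>& pixels,
                      unsigned char threshold,
                      unsigned num_threads) {
    std::atomic<long> bright{0};  // shared between threads, hence atomic
    std::vector<std::thread> workers;
    const std::size_t chunk = (pixels.size() + num_threads - 1) / num_threads;

    for (std::size_t begin = 0; begin < pixels.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, pixels.size());
        workers.emplace_back([&pixels, &bright, threshold, begin, end] {
            long local = 0;  // private to this thread, no synchronization needed
            for (std::size_t i = begin; i < end; ++i) {
                if (pixels[i] > threshold) ++local;
                // Only this thread touches index i, so the write is race-free.
                pixels[i] = static_cast<unsigned char>(255 - pixels[i]);
            }
            bright += local;  // a single atomic update per thread
        });
    }
    for (auto& w : workers) w.join();
    return bright.load();
}
```

Accumulating into a local variable and touching the atomic once per thread keeps the synchronization cost low, whereas incrementing the atomic for every pixel would be correct but noticeably slower.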
EDIT: I ended up using tbb::task_group for parallelization and load balancing. Works like a charm.
you can pass additional data e.g. to the constructor of the class.
yes. I mentioned that. I also mentioned that this data will be the same for each thread. Which is the problem :)
oh, i see. sorry misread it then.
I don't know much about std::thread, but parallel_for_ (<- please use this function, not parallel_for) is meant as a wrapper and supports TBB, OpenMP and some more backends, so it gives more flexibility regarding the threading library underneath. However, you are right, imho it should also support C++11 threads; maybe it will in the future.
Note that you can use tbb::parallel_for directly, which is much simpler than the OpenCV implementation, as it doesn't require any wrapper class. You can use it, for example, to parallelize simple image filters line by line.
See my example code here: http://answers.opencv.org/question/90...