How to remove specified labeled components by the fastest method
Cross post here
Suppose I have a binary image like this:
Mat img(1000, 2000, CV_8UC1);
randu(img, Scalar(0), Scalar(255));
inRange(img, Scalar(160), Scalar(200), img);
I will use connectedComponentsWithStats to label the img:
Mat labels, stats, centroids;
int counts_img = connectedComponentsWithStats(img, labels, stats, centroids);
And I will specify some labels to remove from labels:
vector<int> drop_label(counts_img - 3);
iota(drop_label.begin(), drop_label.end(), 1);
Of course I can do it with the following method:
Mat img(1000, 2000, CV_8UC1);
randu(img, Scalar(0), Scalar(255));
inRange(img, Scalar(160), Scalar(200), img);
Mat labels, stats, centroids;
int counts_img = connectedComponentsWithStats(img, labels, stats, centroids);
vector<int> drop_label(counts_img - 3);
iota(drop_label.begin(), drop_label.end(), 1);
// Start timing.
double start_time = (double)getTickCount();
// Mask of pixels to remove, initialized to all zeros.
Mat select = Mat::zeros(img.size(), img.type());
int img_height = img.rows, img_width = img.cols;
for (int i = 0; i < img_height; i++) {
    const int* plabels = labels.ptr<int>(i);
    uchar* pselect = select.ptr<uchar>(i);
    for (int j = 0; j < img_width; j++) {
        // Linear search through drop_label for every pixel: this is the slow part.
        if (find(drop_label.begin(), drop_label.end(), plabels[j]) != drop_label.end())
            pselect[j] = 255;
    }
}
Mat result = img - select;
// Total time.
double total_time = ((double)getTickCount() - start_time) / getTickFrequency();
cout << "total time: " << total_time << "s" << endl;
total time: 96.8676s
As you can see, this does work, and as far as I know, .ptr is the fastest way to access pixels. But I cannot bear how much time the find function costs me. Can anybody tell me a faster method to do this?
I don't understand :
can you replace with
The simple algorithm is O(height * width * length(drop_label list)).
If you have large overlapping labels, then where you set pselect[j] to 255 you can put a break and avoid iterating over any more labels.
I'd think the find() in the innermost loop is a pretty expensive operation. If the labels are small and less likely to overlap, then it will likely be faster to iterate over the list of labels at the outermost level, and in each iteration go from the label's minimum i to maximum i, and in that loop from the label's minimum j to maximum j.
The overall idea is to try to iterate only where there's actual useful work to be done.
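A minimal sketch of that per-label bounding-box loop, written against plain buffers so it compiles without OpenCV. The names Box and remove_labels_by_box are hypothetical; in the question's code the box fields would presumably be read from stats with stats.at<int>(label, CC_STAT_LEFT), CC_STAT_TOP, CC_STAT_WIDTH and CC_STAT_HEIGHT, and labels would be the CV_32S label image.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one row of the stats matrix.
struct Box {
    int left, top, width, height;
};

// Clears the pixels of every label in drop_label, visiting only each
// label's bounding box instead of the whole image.
// labels and pixels are row-major buffers with img_width columns.
void remove_labels_by_box(const std::vector<int>& labels, int img_width,
                          const std::vector<Box>& boxes,
                          const std::vector<int>& drop_label,
                          std::vector<std::uint8_t>& pixels) {
    for (int label : drop_label) {
        const Box& b = boxes[label];
        for (int i = b.top; i < b.top + b.height; ++i) {
            const std::size_t row = static_cast<std::size_t>(i) * img_width;
            for (int j = b.left; j < b.left + b.width; ++j) {
                // Boxes of different labels may overlap, so check the label.
                if (labels[row + j] == label) pixels[row + j] = 0;
            }
        }
    }
}
```

The work is then proportional to the total bounding-box area of the dropped labels. This pays off when only a few small components are dropped, and much less when most of the image is covered, as in the question's example.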
@LBerger Hi guru, drop_label may include some discontinuous labels; I just made up an example to illustrate my case...
Many possibilities: use drop_label as a boolean array, with drop_label[i] = true when component i should be set (i is the component index).
If you only have a small number of labels to disable (3 in your example), then write only those labels in drop_label: you can then use find with only three labels.
I think you can use a map instead of a vector because lookup should be faster, or maybe better, an unordered_map.
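The boolean-table idea above could look roughly like this. It is a sketch on plain buffers rather than cv::Mat, and make_drop_table and remove_labels are hypothetical names. label_count stands for the value returned by connectedComponentsWithStats, and labels for the row-major contents of the label image.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a flag table indexed by label: table[label] != 0 means "drop it".
std::vector<std::uint8_t> make_drop_table(int label_count,
                                          const std::vector<int>& drop_label) {
    std::vector<std::uint8_t> table(label_count, 0);
    for (int label : drop_label) {
        // Ignore labels outside the valid range instead of writing out of bounds.
        if (label >= 0 && label < label_count) table[label] = 1;
    }
    return table;
}

// Clears every pixel whose label is flagged in the table.
// labels and pixels must have the same number of elements.
void remove_labels(const std::vector<int>& labels,
                   const std::vector<std::uint8_t>& table,
                   std::vector<std::uint8_t>& pixels) {
    for (std::size_t k = 0; k < labels.size(); ++k) {
        if (table[labels[k]]) pixels[k] = 0;
    }
}
```

Each pixel then costs one array lookup instead of a linear find, so the total work is O(height * width + length(drop_label)). Because labels are dense integers from 0 to label_count - 1, a flat table is likely simpler and faster than a map or unordered_map here, though that is worth measuring on the real data.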