2

Detect boxes on shelf Android OpenCV

asked 2016-01-29 08:41:00 -0600

panc

updated 2016-02-01 14:11:53 -0600

Hi all, I'm reposting a question I asked on StackOverflow (where it got no real answers). I hope I'll be luckier here!

I'm developing an Android app that recognizes, from a store shelf image, all the boxes (products) present on the shelf.

My approach so far is the following:

  1. grayscale
  2. bilateral filter (or GaussianBlur, but I've found that using bilateral filter is better to preserve edges)
  3. adaptive threshold
  4. dilate (don't know if it's necessary)
  5. canny
  6. findContours

If the source image is simple (like a b/w drawing of a shelf and some boxes), it can detect them, but with real images of shelves it isn't working.

The main problem is that the individual boxes have different "foreground" colors and logos, so my steps also detect all the edges inside each box (i.e., the colours within the box outline) and give me totally nonsensical results. For simplicity I'll show my intermediate results and the source image below:

1) source

2) grayscale

3) filtered (gaussianBlur in this case)

4) adaptive threshold

5) dilation

6) canny

image description

As you can see, because I cannot remove the foreground of each box, all the edges produced by the logos and text also come into play and add noise to my results.

How can I overcome this problem?

My only idea is to try to "remove" or decolour the inside of the boxes, but I don't know how to do it! Thanks to all!

P.S. Please don't reply with links to the tutorials found on this website; I have already tried them and they didn't help me solve the problem. Thanks!

EDIT:

A possible solution for that specific image could be the following (it is really rough in this example because I did it manually ;) ): image description

This app has to detect boxes and possibly cans, because without perspective (i.e., assuming that I have images taken in front of the shelf) cans look like boxes.


Comments

1

i see cans, but no box at all ;)

berak ( 2016-01-29 08:54:42 -0600 )

Please post a picture that represents what you want to get.

sturkmen ( 2016-01-29 09:01:25 -0600 )

Sorry, I'm Italian, and in my English cans are like boxes! Ahah, yeah, I'll post a possible output.

panc ( 2016-01-29 10:00:21 -0600 )
1

^^ Naw, I'm trying to hint that the outline of a can might not be best represented by a box, which is more of a conceptual problem ;)

berak ( 2016-01-29 10:06:36 -0600 )

Yes, I understood, but I think that, assuming I take the image almost perfectly in front of a shelf, they can be viewed as boxes, because there is no perspective on the round shape! :)

panc ( 2016-01-29 10:16:32 -0600 )

2 answers

3

answered 2016-01-30 08:37:52 -0600

updated 2016-02-02 17:41:05 -0600

I suggest the following approach as a starting point.

Step 1. By using the code below you will get horizontal and vertical mean graphs of the image, like:

image description image description

Step 2. By using the vertical mean graph (shown on the first image) you can easily get four ROIs (take a look at another question).

Step 3. By applying step 1 to each ROI and using the horizontal mean graph you can separate each box.

image description

Try to implement this approach, and if you get stuck at some point, please ask.
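For anyone who wants to see what the mean graphs of step 1 contain without OpenCV at hand, here is a minimal plain C++ sketch. It is only an illustration under stated assumptions: the GrayImage alias and the function names columnMeans and rowMeans are invented for this sketch (they are not OpenCV API), the image is assumed to be rectangular, and the means are kept as doubles, whereas reduce on an 8-bit image stores them back as 8-bit values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical image type for this sketch: rows of 8-bit gray levels.
using GrayImage = std::vector<std::vector<unsigned char>>;

// One mean per column (one value per x), comparable to reducing with dim = 0.
std::vector<double> columnMeans(const GrayImage& img)
{
    if (img.empty()) return {};
    std::vector<double> means(img[0].size(), 0.0);
    for (const auto& row : img) {
        assert(row.size() == means.size());  // the image must be rectangular
        for (std::size_t x = 0; x < means.size(); ++x)
            means[x] += row[x];
    }
    for (double& m : means)
        m /= static_cast<double>(img.size());
    return means;
}

// One mean per row (one value per y), comparable to reducing with dim = 1.
std::vector<double> rowMeans(const GrayImage& img)
{
    std::vector<double> means;
    means.reserve(img.size());
    for (const auto& row : img) {
        double sum = 0.0;
        for (unsigned char v : row) sum += v;
        means.push_back(row.empty() ? 0.0 : sum / static_cast<double>(row.size()));
    }
    return means;
}
```

Dark gaps between shelves should show up as dips in the row means, and gaps between products as dips in the column means of each shelf strip.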

UPDATE 1: see the updated code, which produces a result image like

image description

This approach fails on misaligned pictures like the one below,

so it needs an alignment step (I will work on it).

image description


UPDATE 2

You can keep following my further work on GitHub.


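#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>

using namespace cv;

int main( int argc, char** argv )
{
    // Expect the image path as the first argument.
    if (argc < 2)
        return -1;

    Mat img = imread( argv[1] );
    if (img.empty())
        return -1;

    Mat resized,gray,reduced_h,reduced_w;
    resize( img, resized, Size(), 0.25, 0.25 );

    cvtColor( resized, gray, COLOR_BGR2GRAY );

    // dim = 0 collapses the rows: reduced_h is 1 x cols and holds the mean of each column.
    reduce( gray, reduced_h, 0, REDUCE_AVG );
    // dim = 1 collapses the columns: reduced_w is rows x 1 and holds the mean of each row.
    reduce( gray, reduced_w, 1, REDUCE_AVG );

    // Smooth both profiles (the kernel size is derived from sigma = 3).
    GaussianBlur( reduced_h, reduced_h, Size(),3);
    GaussianBlur( reduced_w, reduced_w, Size(),3);

    Mat reduced_h_graph = resized.clone();
    Mat reduced_w_graph = resized.clone();

    // The profiles are as long as the resized image, not the original one,
    // so the loops run over reduced_h.cols and reduced_w.rows.
    for ( int i = 0; i < reduced_h.cols; i++)
    {
        // Vertical bar at x = i whose length is the mean gray level of column i.
        line( reduced_h_graph,Point(i,0),Point(i,reduced_h.at<uchar>(0,i)),Scalar(255,255,0),1);
    }

    for ( int i = 0; i < reduced_w.rows; i++)
    {
        // Horizontal bar at y = i whose length is the mean gray level of row i.
        line( reduced_w_graph,Point(0,i),Point(reduced_w.at<uchar>(i,0),i),Scalar(0,255,0),1);
    }

    imshow("reduced_h_graph", reduced_h_graph );
    imshow("reduced_w_graph", reduced_w_graph );
    waitKey(0);
    return 0;
}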

UPDATE 1

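#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

using namespace cv;
using namespace std;

// Splits src into strips along one axis.
// dim = 1: average each row, then cut into strips stacked top to bottom.
// dim = 0: average each column, then cut into strips placed side by side.
vector<Rect> divideHW( Mat src, int dim, double threshold1, double threshold2 )
{
    Mat gray, reduced, canny;

    if( src.channels() == 1 )
    {
        gray = src;
    }

    if( src.channels() == 3 )
    {
        cvtColor( src, gray, COLOR_BGR2GRAY );
    }

    // Collapse the image to a single row (dim = 0) or a single column (dim = 1) of means.
    reduce( gray, reduced, dim, REDUCE_AVG );

    // Edges in the 1-D mean profile mark the positions where the image is cut.
    Canny( reduced, canny, threshold1, threshold2, 3, true );

    vector<Point> pts;
    findNonZero( canny, pts);

    vector<Rect> rects;

    Rect rect(0,0,gray.cols,gray.rows);
    int ref_x = 0;
    int ref_y = 0;

    for( size_t i=0; i< pts.size(); i++ )
    {
        if( dim )
        {
           // Strip between the previous cut and this one.
           rect.height = pts[i].y-ref_y;
           rects.push_back( rect );
           rect.y = pts[i].y;
           ref_y = rect.y;
           // After the last cut, also add the strip that reaches the bottom border.
           if( i == pts.size()-1 )
           {
             rect.height = gray.rows - pts[i].y;
             rects.push_back( rect );
           }
        }

        else
        {
           // Strip between the previous cut and this one.
           rect.width = pts[i].x-ref_x;
           rects.push_back( rect );
           rect.x = pts[i].x;
           ref_x = rect.x;
           // After the last cut, also add the strip that reaches the right border.
           if( i == pts.size()-1 )
           {
             rect.width = gray.cols - pts[i].x;
             rects.push_back( rect );
           }
        }

    }
    return rects;
}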

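int main( int argc, char** argv )
{
    // Expect the image path as the first argument.
    if (argc < 2)
        return -1;

    Mat img = imread( argv[1] );
    if (img.empty())
        return -1;

    Mat resized;
    resize( img, resized, Size(), 0.25, 0.25 );

    // First cut the shelf image into horizontal strips (dim = 1).
    vector<Rect> rois_h = divideHW( resized, 1, 0, 255 );

    for( size_t i=0; i< rois_h.size(); i++ )
    {
        Mat roi_h = resized( rois_h[i]);

        // Then cut each strip into boxes (dim = 0).
        vector<Rect> rois_w = divideHW( roi_h, 0, 0, 255 );

        for( size_t j=0; j< rois_w.size(); j++ )
        {
            // The rects are relative to the strip; shift them back into image coordinates.
            rois_w[j].y += rois_h[i].y;
            rectangle( resized, rois_w[j], Scalar( 0, 255, 0), 1 );
            rois_w[j].x = rois_w[j].x * 4;
            rois_w[j ...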
(more)

Comments

Really really interesting... I'll try this and I'll tell you if it solved my problem!

panc ( 2016-02-01 03:04:14 -0600 )

Hi, can you explain this part of the code?

line( reduced_h_graph,Point(i,0),Point(i,reduced_h.at<uchar>(0,i)),Scalar(255,255,0),1);

I understand that it draws a line, but I don't clearly understand the part where you write "reduced_h.at<uchar>(0,i)".

Sorry - I understood.

panc ( 2016-02-01 07:00:52 -0600 )

Are you trying to convert it to Java? Indeed, you don't need to draw the graph; it is drawn only to show the approach.

sturkmen ( 2016-02-01 07:05:40 -0600 )

I know I don't have to call "line" as you did, but I am porting this code to Java and I'd like to know why you set the point coordinates the way the code does. It is the only piece of code that I don't understand (semantically).

panc ( 2016-02-01 07:32:45 -0600 )
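On the coordinates in that line call, a small sketch may help. reduced_h is a single row holding one mean per column, so reduced_h.at<uchar>(0,i) is the mean gray level (0 to 255) of column i, and the call draws a vertical bar at x = i from y = 0 down to y = that mean. The snippet below computes the same end points in plain C++; Pt, Bar and columnBars are hypothetical names for this illustration, and the clamp to the image height is an assumption standing in for the clipping that the OpenCV drawing function does at the image border.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Pt  { int x; int y; };
struct Bar { Pt from; Pt to; };

// For every column i, build the bar from (i, 0) to (i, mean of column i).
// The mean is used directly as a y coordinate, so brighter columns give longer bars.
std::vector<Bar> columnBars(const std::vector<unsigned char>& columnMeans, int imageHeight)
{
    assert(imageHeight > 0);
    std::vector<Bar> bars;
    bars.reserve(columnMeans.size());
    for (std::size_t i = 0; i < columnMeans.size(); ++i) {
        int x = static_cast<int>(i);
        // Keep the end point inside the image (assumption of this sketch).
        int y = std::min<int>(columnMeans[i], imageHeight - 1);
        bars.push_back({{x, 0}, {x, y}});
    }
    return bars;
}
```

As noted in the thread, the drawing is only there to visualise the approach; a port only needs the values themselves.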
1

I will try to implement the second step. Maybe you will understand better once you see how the second step is implemented. Keep following.

sturkmen ( 2016-02-01 07:52:14 -0600 )
1

OK, I've managed to reproduce your solution, and for now it gives me the same results as yours. Now my idea is to use a gradient to detect where there's a huge change in the values produced by reduce (so there's a line) and then plot it. Do you think it could be a good approach? I'm using Sobel derivatives on the x and y axes, but for now I cannot compute their values (it returns only zeroes).

panc ( 2016-02-01 10:07:36 -0600 )
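The gradient idea can also be tried directly on the 1-D mean profile, without any image filter. The sketch below is a hedged illustration, not Sobel and not the Canny-based code from the updated answer: it takes a central difference of the profile and keeps the local maxima above a threshold. The names findJumps and minJump are invented here, and the threshold would need tuning on real shelf images.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the indices where the profile changes sharply.
// A position is kept when its absolute central difference exceeds minJump
// and is a local maximum, so a single step yields a single index.
std::vector<std::size_t> findJumps(const std::vector<double>& profile, double minJump)
{
    assert(minJump >= 0.0);
    const std::size_t n = profile.size();

    // Absolute central difference; the two end points stay at zero.
    std::vector<double> grad(n, 0.0);
    for (std::size_t i = 1; i + 1 < n; ++i)
        grad[i] = std::fabs(profile[i + 1] - profile[i - 1]) / 2.0;

    std::vector<std::size_t> cuts;
    for (std::size_t i = 1; i + 1 < n; ++i)
        if (grad[i] > minJump && grad[i] >= grad[i - 1] && grad[i] > grad[i + 1])
            cuts.push_back(i);
    return cuts;
}
```

One limitation of this simple rule: two steps that are only a couple of samples apart form one plateau in the gradient and are reported as a single cut.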

Hey, I did not think of that. Thanks.. wait, I am trying Canny.

sturkmen ( 2016-02-01 10:23:45 -0600 )

OK, I hope you will have better luck with Canny!

panc ( 2016-02-01 10:40:02 -0600 )

Yes, it works. I will update my answer soon.

sturkmen ( 2016-02-01 10:45:25 -0600 )

Oh, great! Can you briefly explain the methodology you have followed? Not a code description :)

panc ( 2016-02-01 13:05:08 -0600 )
0

answered 2016-02-01 14:52:38 -0600

panc

updated 2016-02-02 08:28:01 -0600

UPDATE 1

Today I debugged a little and found some problems that probably don't show up in C++.

Inside this part of the divideHW method:

for( int i=0; i< pts_ref.size(); i++ )
{
    if( dim )
    {
        // Strip between the previous cut and this one.
        rect.height = (int)(pts_ref.get(i).y-ref_y);
        if( rect.height > 100 )
        {
            Rect r = rect.clone();
            rois.add(r);
        }
        rect.y = (int)pts_ref.get(i).y;
        ref_y = rect.y;

        // After the last cut, also add the strip that reaches the bottom border
        // (rect.y is advanced first, as in the C++ version).
        if( i == pts_ref.size()-1 )
        {
            rect.height = gray.rows() - (int)pts_ref.get(i).y;
            if( rect.height > 100 )
            {
                Rect r = rect.clone();
                rois.add(r);
            }
        }
    }
    else
    {
        // Strip between the previous cut and this one.
        rect.width = (int)pts_ref.get(i).x - ref_x;
        if( rect.width > 50 )
        {
            Rect r = rect.clone();
            rois.add(r);
        }
        rect.x = (int)pts_ref.get(i).x;
        ref_x = rect.x;

        // After the last cut, also add the strip that reaches the right border.
        if( i == pts_ref.size()-1 )
        {
            rect.width = gray.cols() - (int)pts_ref.get(i).x;
            if( rect.width > 50 )
            {
                Rect r = rect.clone();
                rois.add(r);
            }
        }
    }
}

I've added a clone() call on the freshly computed rect: if the rect is added without cloning, then once the for loop finishes all N entries of rois are equal to the last rect analyzed, because the list stores references to the same object.

Second, I now enforce a minimum width and height for the rects, to avoid inconsistent results (i.e., part of the background or part of unwanted boxes).
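Stripped of the OpenCV types, the cutting logic with a minimum size looks roughly like the sketch below. It is an illustration under assumptions, not the code from either post: Segment and splitAtCuts are hypothetical names, and the cut positions are assumed to be sorted and inside the range. It also shows why the cloning problem does not appear in C++: push_back stores a copy of the struct, so later changes to the working variable do not affect entries already in the vector.

```cpp
#include <cassert>
#include <vector>

// A 1-D piece of the image axis: where it starts and how long it is.
struct Segment { int start; int length; };

// Split [0, total) at the given cut positions and keep only the pieces
// longer than minLength (the same "> threshold" rule as in the post).
std::vector<Segment> splitAtCuts(const std::vector<int>& cuts, int total, int minLength)
{
    assert(total >= 0 && minLength >= 0);
    std::vector<Segment> segments;
    int start = 0;
    for (int cut : cuts) {
        assert(cut >= start && cut <= total);  // cuts must be sorted and in range
        if (cut - start > minLength)
            segments.push_back({start, cut - start});  // stored by value (a copy)
        start = cut;  // always advance, even when the piece was too small
    }
    // Piece between the last cut and the border.
    if (total - start > minLength)
        segments.push_back({start, total - start});
    return segments;
}
```

One deliberate difference from the code in this thread: with no cuts at all, this sketch returns the whole range as a single segment, while the posted loop returns nothing in that case.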

Now I can successfully get my boxes, but not exactly: I get a correct box, or multiple boxes together, or parts of boxes (like half a box or a quarter of a box).

I will now try another test image, then I will test whether this approach works with images captured from the camera (taking the photo "perfectly" in front of a shelf, to avoid perspective as much as we can).

test 2 image image description


Comments

(To be sure) I converted your Java to C++ line by line. It works well.

Did you check what the result of this is?

int counter=0;
for(int i=0;i<rois.size();i++)
{
    Mat roi_h=new Mat(original,rois.get(i));
    saveImage(stdDir,roi_h,"product-"+counter+".jpeg");
    counter++;
}

sturkmen ( 2016-02-01 15:46:28 -0600 )

Does the code above save the appropriate parts of your image?

sturkmen ( 2016-02-01 15:49:54 -0600 )

I know there are zero rects because I print a log right after rois = divideHW( clone, true, 0, 255 ); it prints rois.size(), which gives me 0, so there are no rects inside.

panc ( 2016-02-01 15:56:53 -0600 )
1

What is the result of pts_ref.size() after

findNonZero(canny,pts);
List<Point> pts_ref=pts.toList();

sturkmen ( 2016-02-01 16:12:12 -0600 )

Zero! It may be a type error, because you use vector<Point> as the 2nd parameter of findNonZero, but the Java implementation of findNonZero accepts a Mat (not specifically a MatOfPoint, and I don't know another way to convert a Mat into a List<Point>).

panc ( 2016-02-01 16:24:44 -0600 )

UPDATE: after a few logs, I've found that after Canny, canny.cols() gives me 1. Strange.

panc ( 2016-02-01 16:32:10 -0600 )
1

canny.cols() being 1 is as it must be:

reduce( gray, reduced, dim, REDUCE_AVG );
Canny( reduced, canny, threshold1, threshold2, 3, true );
std::cout << canny.cols << std::endl;

Result: canny.cols must be 1 in the first step (which reduces with dim = 1, i.e. to a single column); in the second step it must be equal to the image width.

reduce(gray, reduced, value, REDUCE_AVG); is REDUCE_AVG accepted in Java? (REDUCE_AVG = 1)

sturkmen ( 2016-02-01 16:44:12 -0600 )

I think it is accepted, because previously I was able to view the results of reduce, so I think REDUCE_AVG works well.

panc ( 2016-02-02 04:11:38 -0600 )

Now I'm trying to tune Canny's thresholds: I compute the median value of the grayscale image and then set low_threshold = median*0.66 and high_threshold = median*1.33.

panc ( 2016-02-02 04:22:22 -0600 )
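That threshold rule can be sketched in plain C++ as follows. This is a minimal illustration under assumptions: medianIntensity and cannyThresholds are hypothetical helper names, the pixels are taken as a flat list of 8-bit gray levels, the lower median is used for even counts, and the clamp to the 0..255 range is an addition of this sketch. The factors 0.66 and 1.33 are the ones from the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Median gray level computed from a 256-bin histogram.
int medianIntensity(const std::vector<unsigned char>& pixels)
{
    assert(!pixels.empty());
    std::vector<std::size_t> hist(256, 0);
    for (unsigned char v : pixels) ++hist[v];

    // Walk the histogram until half of the pixels have been seen.
    const std::size_t half = (pixels.size() + 1) / 2;
    std::size_t seen = 0;
    for (int v = 0; v < 256; ++v) {
        seen += hist[v];
        if (seen >= half) return v;
    }
    return 255;  // not reached for non-empty input
}

// Low and high thresholds derived from the median, kept within 0..255.
std::pair<double, double> cannyThresholds(int median)
{
    const double low  = std::max(0.0, 0.66 * median);
    const double high = std::min(255.0, 1.33 * median);
    return {low, high};
}
```

The resulting pair would then be passed as threshold1 and threshold2 in place of the fixed 0 and 255 used so far.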

EDIT: for the type mismatch I've found a simple way to go from Mat to List<Point> without using MatOfPoint and its .toList() method:

Mat pts=new Mat();
List<Point> pts_ref=new ArrayList<>();
Converters.Mat_to_vector_Point(pts,pts_ref);

panc ( 2016-02-02 05:30:45 -0600 )


Stats

Asked: 2016-01-29 08:41:00 -0600

Seen: 6,479 times

Last updated: Feb 02 '16