I have a leaf dataset that I will use to build a model for classifying leaf diseases. However, I first have to separate the foreground from the background and keep only the leaf. Can someone please help me with this? Thanks.
The only help I can give you is a piece of advice: learn OpenCV and image processing in general (and, of course, coding).
Disease classification is a f**ckin' hard problem. And believe me, I know what I'm talking about.
If you cannot do a foreground/background classification with the image of a leaf (which is a very simple problem), there is no reason for you to continue.
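For what it's worth, here is a minimal sketch of one way to do the foreground/background step, assuming the leaf is clearly more colourful than a greyish, evenly lit background (as in controlled-environment photos). It thresholds the HSV-style saturation channel with Otsu's method, written in plain NumPy so it runs without extra dependencies; in OpenCV the thresholding step corresponds to `cv2.threshold` with the `cv2.THRESH_OTSU` flag. The function names `saturation`, `otsu_threshold` and `segment_leaf` are made up for this example, and shadows or coloured backgrounds will likely need more than this.

```python
import numpy as np

def saturation(rgb):
    """HSV-style saturation of an HxWx3 uint8 RGB image, scaled to 0..255."""
    rgb = rgb.astype(np.float64)
    mx = rgb.max(axis=2)
    mn = rgb.min(axis=2)
    sat = np.zeros_like(mx)
    nonzero = mx > 0                      # avoid dividing by zero on black pixels
    sat[nonzero] = (mx[nonzero] - mn[nonzero]) / mx[nonzero]
    return np.round(sat * 255).astype(np.uint8)

def otsu_threshold(values):
    """Otsu threshold for a uint8 array: maximises between-class variance."""
    hist = np.bincount(values.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                  # pixel count at or below each level
    w1 = hist.sum() - w0                  # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    valid = (w0 > 0) & (w1 > 0)           # both classes must be non-empty
    mu0 = np.zeros(256)
    mu1 = np.zeros(256)
    mu0[valid] = sum0[valid] / w0[valid]
    mu1[valid] = (sum0[-1] - sum0[valid]) / w1[valid]
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, -1.0)
    return int(np.argmax(between))

def segment_leaf(rgb):
    """Return (boolean leaf mask, copy of the image with background set to black)."""
    sat = saturation(rgb)
    mask = sat > otsu_threshold(sat)      # assumption: leaf is the more saturated class
    leaf_only = rgb.copy()
    leaf_only[~mask] = 0
    return mask, leaf_only
```

On real photos the mask usually needs a clean-up pass afterwards (for example a morphological opening/closing, or keeping only the largest connected component).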
Thanks for the advice. I have done a few pre-processing techniques and still have 50 days to submit the entire project. The proposal has already been done and I would not be able to change it now :(
Hmmm... While you can build something in 50 days that kind of works and gives some results for images taken in controlled conditions and for specific diseases, this is an unsolved problem in agricultural imaging. That means research teams with years of experience and lots of resources have not been able to come up with a definitive solution.
The issue is very important: a robust solution could lead to selective spraying and a significant reduction in pesticide use in agriculture...
Thank you for your honest feedback. I have the official PlantVillage dataset (images taken in a controlled environment), and my project is not going to be extensive; it is about how accurately my chosen image preprocessing techniques, together with a machine learning algorithm (preferably an ANN or CNN), can assign an image to one of four classes (healthy leaves, apple rust, scab, and rot). I hope this is something that can be achieved within the mentioned timeline :(
Well, you can try feeding those images directly into a CNN (an ANN won't work) as training data, so no presegmentation (and no OpenCV) is needed, only a deep learning framework (Keras, TensorFlow, Caffe)...
...BUT...
CNNs need a lot of training data (tens of thousands of samples per class), and this dataset contains only hundreds. You can play with data augmentation techniques, but I'm still not sure that will be enough...
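To give an idea of what data augmentation can look like, here is a rough sketch that produces extra training samples by random flips, 90-degree rotations and a small brightness change. It is only an illustration: the helper names `augment` and `augment_batch` and the 0.8 to 1.2 brightness range are my own choices, and it assumes square HxWx3 uint8 images so that rotation keeps the shape. Deep learning frameworks ship their own augmentation utilities, which are usually the better option in practice.

```python
import numpy as np

def augment(image, rng):
    """Return one randomly transformed copy of a square HxWx3 uint8 image."""
    out = image
    if rng.random() < 0.5:
        out = np.fliplr(out)                          # horizontal mirror
    if rng.random() < 0.5:
        out = np.flipud(out)                          # vertical mirror
    out = np.rot90(out, k=int(rng.integers(0, 4)))    # rotate by 0/90/180/270 degrees
    gain = rng.uniform(0.8, 1.2)                      # assumed brightness range
    out = np.clip(out.astype(np.float32) * gain, 0, 255)
    return out.astype(np.uint8)

def augment_batch(images, copies, seed=0):
    """Create `copies` augmented variants of every image in `images`."""
    rng = np.random.default_rng(seed)
    return [augment(img, rng) for img in images for _ in range(copies)]
```

Keep in mind that augmented copies are strongly correlated with their originals, so a few hundred images times ten does not behave like a few thousand independent samples.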
Neural networks are not the only technique I am open to. I was also thinking of using an SVM model for this multi-class classification project. Do you think it might be a better fit? Thank you.
First I would try a CNN. As the frameworks and networks are readily available, testing it won't take a lot of time. It might give acceptable results for a 2-month project despite the small database size.
For a more serious approach, and for this database size, I would choose an SVM on texture descriptors + RGB data. However, it's much more difficult to implement, especially for a beginner.
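As a very rough sketch of what "texture descriptors + RGB data" could mean, the snippet below turns an image into a fixed-length feature vector: a normalised colour histogram per channel plus a histogram of gradient magnitudes. The gradient histogram is only a simple stand-in for a proper texture descriptor, and the function names and the bin count of 8 are assumptions made for this example. The resulting vectors could then be fed to an SVM implementation such as scikit-learn's `SVC`.

```python
import numpy as np

def color_histogram(rgb, bins=8):
    """Normalised per-channel histogram of an HxWx3 uint8 image (3 * bins values)."""
    feats = []
    for c in range(3):
        hist, _ = np.histogram(rgb[..., c], bins=bins, range=(0, 256))
        feats.append(hist / rgb[..., c].size)
    return np.concatenate(feats)

def gradient_texture(rgb, bins=8):
    """Normalised histogram of grey-level gradient magnitudes (bins values)."""
    gray = rgb.astype(np.float64).mean(axis=2)
    gy, gx = np.gradient(gray)                        # derivatives along rows, columns
    mag = np.clip(np.hypot(gx, gy), 0, 255)           # clip so every pixel lands in a bin
    hist, _ = np.histogram(mag, bins=bins, range=(0, 256))
    return hist / mag.size

def leaf_features(rgb, bins=8):
    """Concatenate colour and texture features into one vector of length 4 * bins."""
    return np.concatenate([color_histogram(rgb, bins), gradient_texture(rgb, bins)])
```

If the background has been masked to black beforehand, it would make sense to compute these histograms over the leaf pixels only, otherwise the background dominates the features.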
I have done histogram equalisation and applied Gaussian filtering to the images, but I am not sure whether they will improve the training of the CNN. I shall give it a try and let you know what happens. I appreciate your suggestions and guidance, sir. Thank you for the support, I am truly grateful.
The performance of a CNN depends mainly on the size of the training dataset!!! If the dataset is too small, the network will overfit instead of generalizing. So equalization or other types of filtering won't improve the classification performance. See this answer for an explanation.
Where is the leaf dataset?