Using PCA and LDA for dimensionality reduction for SVM
I am preparing data for training an SVM. I use PCA to reduce the dimensionality of the data before using LDA for class-discriminant dimensionality reduction. I then feed the data projected into the LDA subspace to the SVM, as shown in the code below.
Mat trainData;          //Holds data for training. Each row is a sample
vector<Mat> histograms; //Contains row histograms of LBP features
Mat labels;             //Class label for each training sample (filled elsewhere)
int classes = 40;       //Number of classes
convertVectorToMat(histograms, trainData); //Convert vector of Mat to Mat (40 rows, 4096 columns)
PCA pca(trainData, Mat(), PCA::DATA_AS_ROW, (classes - 1));//PCA gives (40 rows, 39 columns)
Mat mean = pca.mean.reshape(1, 1);
//Project data to PCA feature space
Mat projection = pca.project(trainData);
//Perform LDA on data projected on PCA feature space
LDA lda((classes - 1));
lda.compute(projection, labels);
Mat_<float> ldaProjected = lda.project(projection);
normalize(ldaProjected, ldaProjected, 0, 1, NORM_MINMAX, CV_32FC1);
I am passing Mat ldaProjected to the SVM together with the corresponding labels for training. My question is: am I doing this right, or should I have passed Mat projection to the SVM instead? In either case, the SVM gives the same class label for every sample I predict. Kindly advise whether I am preparing my data well for training. I intended to use LDA for dimensionality reduction before training a multi-class SVM.
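One thing worth checking when every prediction returns the same label is that test samples go through exactly the same PCA -> LDA -> normalize chain as the training data. Below is a minimal sketch of that prediction path; the svm object and the extractLBPHistogram() helper are placeholders, since the trained SVM and the feature extraction code are not shown above.
// Sketch: classify one test image with the same PCA -> LDA pipeline used for training.
// extractLBPHistogram() is a hypothetical helper returning a 1 x 4096 CV_32F row histogram.
Mat testHist = extractLBPHistogram(testImage);

Mat testPca = pca.project(testHist);   // 1 x (classes - 1), PCA feature space
Mat testLda = lda.project(testPca);    // 1 x (classes - 1), returned as CV_64F

Mat testFeat;
testLda.convertTo(testFeat, CV_32F);   // SVM expects CV_32F samples
// Note: NORM_MINMAX on a single row rescales it using only its own min/max,
// which is not the same scaling that was applied to the whole training matrix.
normalize(testFeat, testFeat, 0, 1, NORM_MINMAX, CV_32FC1);

float predictedLabel = svm->predict(testFeat);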
technically, your code is correct. but with some more filters in the pipeline now, you probably have to adjust your SVM params.
This is how I am training the SVM. Kindly advise me if I am setting the SVM params well; I intend to perform classification of images from 40 different classes.
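The actual training call is not quoted above, but a typical multi-class setup with OpenCV's ml::SVM looks roughly like the sketch below; the C_SVC/RBF choice and the parameter values are assumptions, not necessarily what was actually used.
// Sketch of a typical OpenCV ml::SVM training setup for this data (assumed parameters).
Ptr<ml::SVM> svm = ml::SVM::create();
svm->setType(ml::SVM::C_SVC);                  // multi-class classification
svm->setKernel(ml::SVM::RBF);                  // kernel choice is an assumption
svm->setC(1.0);
svm->setGamma(0.5);
svm->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 1000, 1e-6));

Mat trainFeatures;
ldaProjected.convertTo(trainFeatures, CV_32F); // samples must be CV_32F
Mat trainLabels;
labels.convertTo(trainLabels, CV_32S);         // responses must be CV_32S for classification

svm->train(trainFeatures, ml::ROW_SAMPLE, trainLabels);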
since you're using C_SVC, try it with C = 0.1, 1, 10, 100, 500, 1000.
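one way to run that sweep, or to let OpenCV search the grid itself, is sketched below; the linear kernel, the fold count and the use of trainAuto() are assumptions.
// Sketch: try several C values by hand, or let trainAuto() search C/gamma on a k-fold grid.
Ptr<ml::TrainData> td = ml::TrainData::create(trainFeatures, ml::ROW_SAMPLE, trainLabels);

double cValues[] = { 0.1, 1, 10, 100, 500, 1000 };
for (double c : cValues)
{
    Ptr<ml::SVM> candidate = ml::SVM::create();
    candidate->setType(ml::SVM::C_SVC);
    candidate->setKernel(ml::SVM::LINEAR);  // kernel is an assumption
    candidate->setC(c);
    candidate->train(td);
    // evaluate each candidate on held-out data and keep the best C
}

// Alternative: cross-validated grid search over the default C/gamma grids.
Ptr<ml::SVM> autoSvm = ml::SVM::create();
autoSvm->setType(ml::SVM::C_SVC);
autoSvm->setKernel(ml::SVM::RBF);
autoSvm->trainAuto(td, 5);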
I have tried both methods and it is still not working. Could I be preparing my data wrongly? When I look at my trained model, I have 3304 support vectors, but they are all populated with 0. I am extracting LBP features from images and pushing the LBP histograms into a vector<Mat>. Then I am converting this vector<Mat> of histograms to Mat trainData, as shown in the code below. Kindly advise if my logic is OK. This is how I am converting from vector<Mat> histograms to Mat trainData.
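The conversion code itself is not reproduced above; the sketch below shows what such a convertVectorToMat() helper typically has to do (one histogram per row, forced to CV_32F). It is a sketch under those assumptions, not the original helper.
// Sketch: flatten each 1 x N LBP histogram into one row of a CV_32F training matrix.
// Assumes every histogram has the same number of bins (e.g. 4096).
void convertVectorToMat(const vector<Mat>& histograms, Mat& trainData)
{
    if (histograms.empty()) return;

    int cols = (int)histograms[0].total();
    trainData.create((int)histograms.size(), cols, CV_32F);

    for (size_t i = 0; i < histograms.size(); i++)
    {
        Mat row;
        histograms[i].reshape(1, 1).convertTo(row, CV_32F); // flatten and force float
        row.copyTo(trainData.row((int)i));
    }
}
If the histograms are stored as an integer type, the convertTo(CV_32F) step matters, since the SVM expects CV_32F samples.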
Even in my LDA model, the eigenvectors consist of many 0s and a few 1s only, with no other values in the XML file. Kindly advise on that logic of converting from vector<Mat> to Mat and generally on how to prepare a training matrix. Thank you.
again imho, "technically", your code is ok.
no idea, but somehow, i'd try first without the PCA/LDA. also, try with more data; 40 train items might just not be enough.
ohhh, wait, that means, you only got 1 sample per class ? that's just bad.
again, needs more data, imho.
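a minimal version of the "no PCA/LDA" baseline suggested above would train the SVM straight on the raw LBP histogram matrix; the kernel and C value here are assumptions.
// Sketch: baseline without PCA/LDA, training directly on the LBP histogram matrix.
Ptr<ml::SVM> baseline = ml::SVM::create();
baseline->setType(ml::SVM::C_SVC);
baseline->setKernel(ml::SVM::LINEAR);  // kernel is an assumption
baseline->setC(1.0);

Mat rawLabels;
labels.convertTo(rawLabels, CV_32S);
baseline->train(trainData, ml::ROW_SAMPLE, rawLabels);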
Thank you berak. I have more samples per class (at least 4 and at most 7). Should I use the same number of images per class? For 40 classes, I have a total of 220 images that I am using to train the SVM.
it won't matter much if you have 5 for one class, and 7 for another. just try to keep it halfway balanced, and use as many as you can.