Hi,
Below is part of the code from lda.cpp.
    // calculate sums
    for (int i = 0; i < N; i++) {
        Mat instance = data.row(i);
        int classIdx = mapped_labels[i];
        add(meanTotal, instance, meanTotal);
        add(meanClass[classIdx], instance, meanClass[classIdx]);
        numClass[classIdx]++;
    }
    // calculate total mean
    meanTotal.convertTo(meanTotal, meanTotal.type(), 1.0 / static_cast<double>(N));
    // calculate class means
    for (int i = 0; i < C; i++) {
        meanClass[i].convertTo(meanClass[i], meanClass[i].type(), 1.0 / static_cast<double>(numClass[i]));
    }
    // subtract class means
    for (int i = 0; i < N; i++) {
        int classIdx = mapped_labels[i];
        Mat instance = data.row(i);
        subtract(instance, meanClass[classIdx], instance);
    }
    // calculate within-classes scatter
    Mat Sw = Mat::zeros(D, D, data.type());
    mulTransposed(data, Sw, true);
The last two statements (under the "calculate within-classes scatter" comment) are the ones that calculate the within-class scatter. My question is: is this correct? From my understanding, the within-class scatter is calculated after subtracting each class mean from the samples of that class. But here mulTransposed is applied to data, which looks to me like the data samples before the class means are subtracted. Should it be applied to instance instead of data? Please correct me if I am wrong. I am new to this.
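To make my understanding concrete, here is a minimal sketch in plain C++ (no OpenCV) of how I would expect the within-class scatter to be computed. The function name withinClassScatter, the Matrix alias and the row-per-sample layout are my own assumptions for illustration; none of them come from lda.cpp.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper (not part of lda.cpp): computes
//   Sw = sum over all samples x of (x - m) * (x - m)^T,
// where m is the mean of the class that x belongs to.
// data is N x D with one sample per row; labels[i] is in [0, numClasses).
Matrix withinClassScatter(const Matrix& data,
                          const std::vector<int>& labels,
                          int numClasses) {
    const std::size_t N = data.size();
    const std::size_t D = N ? data[0].size() : 0;

    // Accumulate per-class sums and sample counts.
    Matrix mean(numClasses, std::vector<double>(D, 0.0));
    std::vector<int> count(numClasses, 0);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t d = 0; d < D; ++d) mean[labels[i]][d] += data[i][d];
        ++count[labels[i]];
    }

    // Turn the sums into class means (empty classes are left at zero).
    for (int c = 0; c < numClasses; ++c) {
        if (count[c] == 0) continue;
        for (std::size_t d = 0; d < D; ++d) mean[c][d] /= count[c];
    }

    // Subtract the class mean from each sample and add its outer product.
    Matrix Sw(D, std::vector<double>(D, 0.0));
    std::vector<double> diff(D);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t d = 0; d < D; ++d) diff[d] = data[i][d] - mean[labels[i]][d];
        for (std::size_t r = 0; r < D; ++r)
            for (std::size_t c = 0; c < D; ++c) Sw[r][c] += diff[r] * diff[c];
    }
    return Sw;
}
```

In this sketch the outer products are built from the mean-subtracted samples (diff), which is why I expected the OpenCV code to do the same.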
Thanks!