
In which order to supply weights for SVM?

asked 2018-12-05 13:08:57 -0600

dimitris2315

updated 2018-12-05 13:41:47 -0600

I have tried very hard to find some kind of answer or documentation on this and I'm sorry if this is obvious.

I'm using OpenCV's SVM in C++. Let's say I use the labels -1, 1, 2, 4 for the different classes, and I want to use cv::ml::SVM::setClassWeights() to assign a weight to each of these labels (call these weights W-1, W1, W2, W4).

The weights vector (cv::Mat) should have 4 rows and 1 column (4 rows because we use 4 different labels) and contain at each row a weight for one of the labels. But which goes where?

Should I set the weights in this vector as [W-1, W1, W2, W4]^T, following the order of the labels? Or should I sort the weights in descending or ascending order by their value and let the algorithm figure it out itself?

Thank you.

edit: For clarity I will put the example here too:

Suppose we have 2 features and 100 training samples and the labels for the training data are -1,1,2,4.

training_mat = cv::Mat::zeros(100, 2, CV_32F);

training_mat.ptr<int>(0)[0] = 56; //this might not be the best way to do this but this came in mind

training_mat.ptr<int>(0)[1] = 76;

training_mat.ptr<int>(1)[0] = 86; //these are data

//etc until filled

labels_mat = cv::Mat::zeros(100, 1, CV_32S);

//note: has 1 row for each row of training_mat

labels_mat.ptr<int>(0)[0] = -1;

labels_mat.ptr<int>(1)[0] = 4;

//etc

weights_mat = cv::Mat::zeros(4,1, CV_32F);

weights_mat.at<float>(0, 0)=?

weights_mat.at<float>(1, 0)=?

weights_mat.at<float>(2, 0)=?

weights_mat.at<float>(3, 0)=?

auto svm = cv::ml::SVM::create();

//... C_SVC

svm->setClassWeights(weights_mat);

//...

I know what the weights for each class have to be (let's say for -1 I want them to be 0.1, 1->0.3, 2->0.4, 4->0.2).

Should I do:

weights_mat.at<float>(0, 0) = 0.1;

weights_mat.at<float>(1, 0) = 0.3;

weights_mat.at<float>(2, 0) = 0.4;

weights_mat.at<float>(3, 0) = 0.2;

or something else?


Comments

could you come up with a minimal reproducing example code for this ?

(and append that to your question)

berak ( 2018-12-05 13:13:29 -0600 )

@berak Suppose we have 2 features and 100 training samples and the labels for the training data are -1,1,2,4.


dimitris2315 ( 2018-12-05 13:35:04 -0600 )

@berak A slight comment for my code. I know what the weights for each class have to be (let's say for -1 I want them to be 0.1, 1->0.3, 2->0.4, 4->0.2).

Should I do:

weights_mat.at<float>(0, 0) = 0.1;

weights_mat.at<float>(1, 0) = 0.3;

weights_mat.at<float>(2, 0) = 0.4;

weights_mat.at<float>(3, 0) = 0.2;

or something else? Thank you.

dimitris2315 ( 2018-12-05 13:39:10 -0600 )

Something is wrong here: training_mat = cv::Mat::zeros(100, 2, CV_32F); training_mat.ptr<int>(0)[0] = 56; // CV_32F is not int, so you should use training_mat.at<float>(0,0) = 56;

training_mat.at<float>(0,1) = 76;

Also, the weights must be set when the train data is created (cv::ml::TrainData::create).

LBerger ( 2018-12-05 14:53:38 -0600 )
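To illustrate the point about ptr<int> on a CV_32F matrix, here is a minimal sketch in plain C++ (no OpenCV involved; both helper names are hypothetical). It assumes a 32-bit int and float, and it only mimics what happens to a single element: writing through an int pointer stores the integer's raw bytes in float storage, while at<float>() performs a real numeric conversion.

```cpp
#include <cstring>

// Hypothetical helper: reinterprets the bytes of an int as a float, which is
// effectively what writing through ptr<int>() into CV_32F storage does.
float floatFromIntBits(int value)
{
    static_assert(sizeof(int) == sizeof(float),
                  "sketch assumes 32-bit int and float");
    float result;
    std::memcpy(&result, &value, sizeof(result)); // raw byte copy, no conversion
    return result;
}

// What assigning 56 through at<float>() stores instead: a converted value.
float floatFromIntValue(int value)
{
    return static_cast<float>(value);
}
```

With these helpers, floatFromIntValue(56) is 56.0f, whereas floatFromIntBits(56) is a tiny value close to zero, so a feature written the first way would be silently wrong rather than cause a crash.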

@LBerger you are right, and that is how it is in my actual code. I was quoting from memory when I wrote the previous messages; my code runs without problems.

As for the question, what you linked actually solves the problem. Thank you! For each training sample I will just have to pass its label's weight. For the sake of curiosity, if you happen to know: should cv::ml::SVM::setClassWeights() be generally avoided? Is there actually an order in which the weights should be assigned?

dimitris2315 ( 2018-12-05 19:25:01 -0600 )

I haven't got time today to investigate, but svm seems to use the svm.params member, and I cannot find where sampleWeights is copied into svm.params.

LBerger ( 2018-12-06 02:31:46 -0600 )

1 answer


answered 2018-12-06 03:23:06 -0600

berak

updated 2018-12-06 03:32:02 -0600

there is no problem with setting the weights using

svm->setClassWeights(weights);

at all. while you could also pass them into the TrainData structure, it does not matter, in both cases we have valid params.classWeights in SVMImpl::do_train(). (where those are actually used)

it also does not matter, if you make it 1x4 or 4x1, the only requirements are:

  • 1 dim has to be 1
  • the other dim must be numClasses
  • type (of the weights) should be float or double
  • SVM type must be C_SVC

so, everything is all right, and happy coding ;)
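The shape requirements in the list above can be sketched as a small check in plain C++. This is only an illustration, not OpenCV code: countClasses and weightShapeMatches are hypothetical names, and the type and C_SVC conditions are left out because they need the actual cv::Mat and SVM objects.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Number of distinct class labels found in the training responses.
std::size_t countClasses(const std::vector<int>& labels)
{
    return std::set<int>(labels.begin(), labels.end()).size();
}

// Hypothetical check mirroring the list above: one dimension is 1 and the
// other equals the number of classes, so both 1xN and Nx1 are accepted.
bool weightShapeMatches(int rows, int cols, std::size_t numClasses)
{
    if (rows <= 0 || cols <= 0)
        return false;
    const std::size_t r = static_cast<std::size_t>(rows);
    const std::size_t c = static_cast<std::size_t>(cols);
    return (r == 1 && c == numClasses) || (c == 1 && r == numClasses);
}
```

For the labels -1, 1, 2, 4 from the question, countClasses gives 4, so a 4x1 or a 1x4 weights matrix passes the check while a 2x2 or 5x1 one does not.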


Comments

I am not sure I understand the problem. After debugging and carefully reading the docs, I understand that classWeight is a penalty for each class and sampleWeight is the weight of each sample.

LBerger ( 2018-12-06 12:11:23 -0600 )

not sure, if i understand it correctly, either , but it seems it's just "weighting" the C param per class here

berak ( 2018-12-06 12:14:08 -0600 )
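If the reading in the comment above is right, and each class weight simply scales the C parameter, the per-class penalty could be sketched as below. This is an assumption about the behaviour, not a copy of the OpenCV implementation, and effectivePenalties is a hypothetical helper name.

```cpp
#include <vector>

// Hypothetical helper: per-class penalty under the assumption that each
// class weight simply multiplies the global C parameter.
std::vector<double> effectivePenalties(double C,
                                       const std::vector<double>& classWeights)
{
    std::vector<double> penalties;
    penalties.reserve(classWeights.size());
    for (double w : classWeights)
        penalties.push_back(w * C); // penalty for class i = weight i * C
    return penalties;
}
```

Under that assumption, a weight above 1 makes misclassifying samples of that class more expensive than the global C alone, and a weight below 1 makes it cheaper.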

I think that sampleWeight is not used in svm

LBerger ( 2018-12-06 12:18:30 -0600 )

hehe, that would be the "elephant in the room" ;)

berak ( 2018-12-06 12:19:58 -0600 )

Hello guys, OP here. @berak thanks for your time but I already knew those. The question was about in which order should they be supplied.

Like the difference between [0.1,0.3,0.2,0.4] vs [0.4,0.3,0.2,0.1]

dimitris2315 ( 2018-12-06 13:19:20 -0600 )

the 1st entry has to correlate to class 0

but again, is it ever used O_o ?

berak ( 2018-12-06 13:29:47 -0600 )

@berak this is definitely not the case, because I actually use classes 1,2,3,4 and a vector with 4 entries works (if it started from zero, it would have needed 5 entries and it would have crashed).

dimitris2315 ( 2018-12-06 20:11:36 -0600 )

class_weights are used here

@dimitris2315 , the samples / classes get sorted here and the weights apply in sorting order of the class ids

(maybe i just suck at explaining it ...)

so the 1st weights entry applies to your smallest class id, and the last to the largest

berak ( 2018-12-07 01:40:44 -0600 )

@berak Great, thank you very much!

dimitris2315 ( 2018-12-07 10:11:43 -0600 )
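The conclusion of the thread (first weight for the smallest class id, last weight for the largest) can be sketched in plain C++ as follows. The helper weightsInLabelOrder is hypothetical and not part of OpenCV; it only arranges the weights in ascending label order so they can then be copied into the weights matrix.

```cpp
#include <map>
#include <vector>

// Hypothetical helper: returns the per-class weights ordered by ascending
// class label, which is the order described in the comments above.
std::vector<float> weightsInLabelOrder(const std::map<int, float>& weightByLabel)
{
    std::vector<float> ordered;
    ordered.reserve(weightByLabel.size());
    // std::map iterates its keys in ascending order, so label -1 comes
    // before 1, 2 and 4 regardless of insertion order.
    for (const auto& entry : weightByLabel)
        ordered.push_back(entry.second);
    return ordered;
}
```

For the question's example, a map holding -1 -> 0.1, 1 -> 0.3, 2 -> 0.4, 4 -> 0.2 yields 0.1, 0.3, 0.4, 0.2. Assuming the usual cv::Mat constructor from a std::vector, cv::Mat(ordered, true) would give a 4x1 CV_32F matrix that can be passed to svm->setClassWeights().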


Stats

Asked: 2018-12-05 13:08:57 -0600

Seen: 612 times

Last updated: Dec 06 '18