Sad as it is, none of what you wanted is built in. All strings are discarded while reading the CSV and replaced with (1-based!) indices, in order of appearance.
Maybe you should (or need to) do your own CSV preprocessing to handle this better, instead of relying on OpenCV's TrainData utility.
Categorical values that do not represent an order (e.g. "cat", "autobus", "accordeon") should NOT be represented in a single numeric variable (as is done here, (0,1,2)). Instead, you have to find a suitable embedding for your string set, such as "one-hot" encoding them.
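To make the one-hot idea concrete, here is a minimal sketch of such a preprocessing step in plain Python. The CSV contents, the column index and the helper name `one_hot_encode_column` are all made up for illustration; it assumes every other column is numeric and that there is no header row.

```python
import csv
import io


def one_hot_encode_column(rows, col):
    """Replace the string column `col` with one 0/1 column per distinct value.

    Hypothetical helper, not part of OpenCV. All other columns are assumed
    to be numeric and are converted to float.
    """
    # Collect the distinct strings in order of first appearance.
    categories = []
    for row in rows:
        if row[col] not in categories:
            categories.append(row[col])

    encoded = []
    for row in rows:
        # Numeric features, skipping the string column.
        numeric = [float(v) for i, v in enumerate(row) if i != col]
        # One 0/1 entry per category; exactly one of them is 1.
        one_hot = [1.0 if row[col] == c else 0.0 for c in categories]
        encoded.append(numeric + one_hot)
    return encoded, categories


# Made-up CSV: two numeric features followed by one string feature.
csv_text = "1.5,2.0,cat\n0.3,4.1,autobus\n2.2,0.7,accordeon\n1.1,3.3,cat\n"
rows = list(csv.reader(io.StringIO(csv_text)))

samples, categories = one_hot_encode_column(rows, col=2)
for sample in samples:
    print(sample)
```

Each string now occupies its own 0/1 column, so no artificial ordering between "cat", "autobus" and "accordeon" is implied. The resulting rows could then be converted to a float32 NumPy array and passed to OpenCV (for example through `cv2.ml.TrainData_create`) instead of loading the CSV directly.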