Hi! One day I caught this. It's easy :)
For a categorical variable, a tree split is stored as a bitmap "subset". This mask determines which categories of the split variable (ie which samples) go to the left child node (direction -1) and which go to the right one (direction +1). The macro computes the direction for a given category of the variable.
In the implementation the bitmap "subset" is an array of 'int'. "idx" is a given category.
(idx)>>5 - it's equivalent to integer division by 32, the number of bits in an 'int' (assuming a 32-bit 'int'), ie we find the index of the element of the array "subset" that contains the bit for the given category;
(idx) & 31 - it's the remainder of dividing by 32. Here we find the position of the category's bit within that array element.
1 << ((idx) & 31) - it gives a mask filled with zeros and having a single "1" in the required position.
(subset[(idx)>>5]&(1 << ((idx) & 31)))==0 - here we check the bit value for the given category.
(2*((subset[(idx)>>5]&(1 << ((idx) & 31)))==0)-1) - the comparison yields 1 if the bit is clear and 0 if it is set, so a set bit gives direction -1 (left) and a clear bit gives direction +1 (right).
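To see the pieces working together, here is a small sketch in C. The expression is copied from the breakdown above; the macro name CAT_DIR and the helpers set_left and cat_dir are made up for this illustration, and a 32-bit 'int' is assumed.

```c
/* Hypothetical name for the macro; the expression is the one explained above.
   Assumes a 32-bit int, so each array element holds 32 category bits. */
#define CAT_DIR(idx, subset) \
    (2*((subset[(idx)>>5]&(1 << ((idx) & 31)))==0)-1)

/* Hypothetical helper: mark category idx as "goes left" by setting its bit. */
static void set_left(int *subset, int idx)
{
    subset[idx >> 5] |= 1 << (idx & 31);
}

/* Hypothetical wrapper: returns -1 (left) if the bit is set, +1 (right) otherwise. */
static int cat_dir(const int *subset, int idx)
{
    return CAT_DIR(idx, subset);
}
```

For example, starting from int subset[2] = {0, 0} and calling set_left for categories 3 and 35: category 3 lands in subset[0] at bit 3, and category 35 lands in subset[1] at bit 3 (35>>5 is 1, 35&31 is 3). After that cat_dir returns -1 for 3 and 35, and +1 for any other category such as 4.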