For DecisionVariableKind=Continuous, candidates for cut values are determined by this code:
    if (o[j] != o[j + 1])
        candidates.Add((v[j] + v[j + 1]) / 2.0);
The order of the output values (the labels) is determined by the sort order of the actual values, so it can change among several equal values.
For example, after sorting:
Values are: 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 24, 24, 26, 26, 28, 29, 30, 30
Labels are: 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1
Candidates for split in this case are: 2, 2, 2, 3, 3, 3
Another option for sorting would be:
Values are: 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 24, 24, 26, 26, 28, 29, 30, 30
Labels are: 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1
(I've just switched the order of two equal values)
In that case, although the values are exactly the same, the candidates for split would be: 2, 2, 2, 3, 3, 3, 3, 13.5.
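To make the two cases above easy to reproduce, here is a minimal, self-contained sketch. The loop body is the two lines quoted at the top; the class name CutCandidates, the method name Midpoints and the array layout are assumptions made for illustration, not the library's actual code.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: CutCandidates and Midpoints are hypothetical names,
// not part of the library. v holds the sorted values, o the labels in the same order.
public static class CutCandidates
{
    public static List<double> Midpoints(double[] v, int[] o)
    {
        var candidates = new List<double>();
        for (int j = 0; j < v.Length - 1; j++)
        {
            // Same check as in the quoted code: a candidate is added
            // wherever two neighbouring labels differ, even if the
            // two neighbouring values are equal.
            if (o[j] != o[j + 1])
                candidates.Add((v[j] + v[j + 1]) / 2.0);
        }
        return candidates;
    }
}
```

Calling CutCandidates.Midpoints with the values and the first label ordering gives the six candidates listed above; with the second ordering it gives eight, including 13.5.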
This behavior may result in the same decision tree regardless of whether the values are:
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 24, 24, 26, 26, 28, 29, 30, 30 OR
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 240, 240, 260, 260, 280, 290, 300, 300 OR
2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 2400, 2400, 2600, 2600, 2800, 2900, 3000, 3000 OR..
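This happens because, with the first ordering, both the last 3 and the first 24 are labeled 1 (most of the other 3's are labeled 0), so no candidate is generated between 3 and the larger values.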
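One possible direction, shown only as a rough sketch and not as a tested fix: group the labels by distinct value first, and place candidates only between neighbouring distinct values, skipping a boundary only when both neighbouring values are pure and carry the same label. The result then no longer depends on how equal values happen to be ordered. The names OrderIndependentCuts and Midpoints are hypothetical and not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch only: OrderIndependentCuts is a hypothetical helper,
// not part of the library. The input does not need to be sorted.
public static class OrderIndependentCuts
{
    public static List<double> Midpoints(double[] values, int[] labels)
    {
        // Collect the set of labels seen for each distinct value;
        // SortedDictionary enumerates its keys in ascending order.
        var groups = new SortedDictionary<double, HashSet<int>>();
        for (int i = 0; i < values.Length; i++)
        {
            if (!groups.TryGetValue(values[i], out var set))
                groups[values[i]] = set = new HashSet<int>();
            set.Add(labels[i]);
        }

        var candidates = new List<double>();
        var keys = groups.Keys.ToList();
        for (int k = 0; k < keys.Count - 1; k++)
        {
            var left = groups[keys[k]];
            var right = groups[keys[k + 1]];
            // Skip the boundary only when both neighbouring values are pure
            // (one label each) and that label is the same on both sides.
            bool samePureLabel = left.Count == 1 && right.Count == 1 && left.SetEquals(right);
            if (!samePureLabel)
                candidates.Add((keys[k] + keys[k + 1]) / 2.0);
        }
        return candidates;
    }
}
```

For the example data this yields the candidates 2.5 and 13.5 for either label ordering, and the second candidate moves to 121.5 when the large values are 240, 260, and so on, so the three data sets above are no longer indistinguishable.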
Any suggestions on what can be done?
Thanks a lot,
Sivan.