The split routines already identify which features have the most predictive power (information gain) via Shannon entropy. So IMO, manually defining which features are of high importance is unnecessary, and I don't know of any DT implementation out there that supports this capability.
But if you have an implementation in mind for this, we'd be happy to consider it.
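For context, here is a minimal sketch of what entropy-driven splitting looks like: the gain of a candidate split is the drop in Shannon entropy it produces, so informative features win splits automatically. This is toy code, not DecisionTree.jl's internals:

```julia
# Minimal sketch of entropy-based information gain
# (toy code, not DecisionTree.jl's internal implementation).

# Shannon entropy of a label vector.
function entropy(labels::AbstractVector)
    n = length(labels)
    H = 0.0
    for c in unique(labels)
        p = count(==(c), labels) / n
        H -= p * log2(p)
    end
    return H
end

# Information gain of splitting on `feature .< threshold`.
function information_gain(labels::AbstractVector, feature::AbstractVector, threshold)
    mask = feature .< threshold
    left, right = labels[mask], labels[.!mask]
    n = length(labels)
    entropy(labels) - length(left)/n * entropy(left) - length(right)/n * entropy(right)
end
```

A split routine simply evaluates this gain for every candidate (feature, threshold) pair and keeps the best one, which is why high-information features dominate the tree without any manual weighting.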
My application uses a decision tree whose output is a function name. That function is called in subsequent code, and there are special rules associated with the functions: rule 1 applies to some functions, rule 2 to some others, and so on.
The problem is that the decision tree model is not able to learn these rules.
My guess is that this happens because the amount of data is not the same across the rules.
I did not modify the code in DecisionTree.jl, but used a simple workaround. The steps (a sketch follows the list):

1. Draw a flow chart and list the nodes that need to be created explicitly.
2. Write a logic rule for each node, such as `(feature A > 0.5 && B < 100)`.
3. Create sub-datasets by filtering the dataset with the rules above.
4. Build a tree for each sub-dataset.
5. Create the entire tree by calling `Node{Float64,String}()` recursively.
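A hedged sketch of those steps (the data, the rule, and the labels are hypothetical; also note that on recent DecisionTree.jl releases `build_tree` returns a wrapper around the root node, which would need unwrapping before stitching):

```julia
using DecisionTree

# Toy data: feature A in [0, 1], feature B in [0, 200],
# labels standing in for function names.
X = rand(1000, 2) .* [1.0 200.0]
y = rand(["f1", "f2", "f3"], 1000)

# Hand-written rule for one explicit node: (A > 0.5 && B < 100).
rule = (X[:, 1] .> 0.5) .& (X[:, 2] .< 100.0)

# Build a sub-tree on each filtered sub-dataset.
sub_hit  = build_tree(y[rule],   X[rule, :])    # rows matching the rule
sub_miss = build_tree(y[.!rule], X[.!rule, :])  # all remaining rows

# Stitch the sub-trees under explicitly constructed Nodes. A Node sends
# rows with features[featid] < featval to `left`, so (A > 0.5 && B < 100)
# becomes two levels: test A at the root, then B one level down.
# `sub_miss` is reused on both paths where the rule fails.
inner = DecisionTree.Node{Float64, String}(2, 100.0, sub_hit, sub_miss)
root  = DecisionTree.Node{Float64, String}(1, 0.5, sub_miss, inner)
```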
I feel this is an ugly solution.
I'm not an ML expert. Could you tell me whether this feature would be valuable? That is, would anyone else use it?
If so, I'll implement it in DecisionTree.jl when I have free time.
I want to assign weight values to the features, so that the split function uses features with larger weights first.
Some of the features are more important than the others, and I want to control the split process accordingly.
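For concreteness, here is a hypothetical sketch of what I have in mind: scale each candidate split's information gain by a per-feature weight, so heavily weighted features win splits earlier. It reuses the `information_gain` helper sketched above; this is not an existing DecisionTree.jl API.

```julia
# Hypothetical: pick the split maximizing weight-scaled information gain.
# `w[j]` is the user-assigned importance of feature j; `information_gain`
# is the toy helper sketched earlier in this thread.
function best_weighted_split(y::AbstractVector, X::AbstractMatrix, w::AbstractVector)
    best = (gain = -Inf, feat = 0, thresh = 0.0)
    for j in axes(X, 2), t in unique(X[:, j])
        g = w[j] * information_gain(y, X[:, j], t)
        g > best.gain && (best = (gain = g, feat = j, thresh = t))
    end
    return best
end

# Usage: weight feature 1 twice as heavily as feature 2.
# best_weighted_split(y, X, [2.0, 1.0])
```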