The split routines already identify which features have the most predictive power (information gain) via Shannon entropy. So IMO, manually defining which features are of high importance is unnecessary, and I don't know of any DT implementation out there that supports this capability.
But if you have an implementation in mind for this, we'd be happy to consider it.
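For context, here is a minimal sketch of what entropy-driven splitting looks like: the gain of a candidate split is the drop in Shannon entropy it produces, so informative features win splits automatically. This is toy code, not DecisionTree.jl's internals:

```julia
# Minimal sketch of entropy-based information gain
# (toy code, not DecisionTree.jl's internal implementation).

# Shannon entropy of a label vector.
function entropy(labels::AbstractVector)
    n = length(labels)
    H = 0.0
    for c in unique(labels)
        p = count(==(c), labels) / n
        H -= p * log2(p)
    end
    return H
end

# Information gain of splitting on `feature .< threshold`.
function information_gain(labels::AbstractVector, feature::AbstractVector, threshold)
    mask = feature .< threshold
    left, right = labels[mask], labels[.!mask]
    n = length(labels)
    entropy(labels) - length(left)/n * entropy(left) - length(right)/n * entropy(right)
end
```

A split routine simply evaluates this gain for every candidate (feature, threshold) pair and keeps the best one, which is why high-information features dominate the tree without any manual weighting.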
My application uses a decision tree whose output is a function name. That function is called in subsequent code, and there are special rules associated with the functions: rule 1 applies to some functions, rule 2 to some others, and so on.
The problem is that the decision tree model is not able to learn these rules.
My guess is that this happens because the amount of data is not the same across the rules.
I did not modify the code in DecisionTree.jl, but used a simple workaround. The steps (a sketch follows the list):

1. Draw a flow chart and list the nodes that need to be created explicitly.
2. Write a logic rule for each node, such as `(feature A > 0.5 && B < 100)`.
3. Create sub-datasets by filtering the dataset with the rules above.
4. Build a tree for each sub-dataset.
5. Create the entire tree by calling `Node{Float64,String}()` recursively.
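A hedged sketch of those steps (the data, the rule, and the labels are hypothetical; also note that on recent DecisionTree.jl releases `build_tree` returns a wrapper around the root node, which would need unwrapping before stitching):

```julia
using DecisionTree

# Toy data: feature A in [0, 1], feature B in [0, 200],
# labels standing in for function names.
X = rand(1000, 2) .* [1.0 200.0]
y = rand(["f1", "f2", "f3"], 1000)

# Hand-written rule for one explicit node: (A > 0.5 && B < 100).
rule = (X[:, 1] .> 0.5) .& (X[:, 2] .< 100.0)

# Build a sub-tree on each filtered sub-dataset.
sub_hit  = build_tree(y[rule],   X[rule, :])    # rows matching the rule
sub_miss = build_tree(y[.!rule], X[.!rule, :])  # all remaining rows

# Stitch the sub-trees under explicitly constructed Nodes. A Node sends
# rows with features[featid] < featval to `left`, so (A > 0.5 && B < 100)
# becomes two levels: test A at the root, then B one level down.
# `sub_miss` is reused on both paths where the rule fails.
inner = DecisionTree.Node{Float64, String}(2, 100.0, sub_hit, sub_miss)
root  = DecisionTree.Node{Float64, String}(1, 0.5, sub_miss, inner)
```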
I feel this is an ugly solution.
I'm not an ML expert. Could you tell me whether this feature would be valuable? That is, would anyone else use it?
If so, I'll implement it in DecisionTree.jl when I have free time.
I want to assign weight values to the features, so that the split function uses features with larger weights first.
Some of the features are more important than the others, and I want to control the split process accordingly.
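For concreteness, here is a hypothetical sketch of what I have in mind: scale each candidate split's information gain by a per-feature weight, so heavily weighted features win splits earlier. It reuses the `information_gain` helper sketched above; this is not an existing DecisionTree.jl API.

```julia
# Hypothetical: pick the split maximizing weight-scaled information gain.
# `w[j]` is the user-assigned importance of feature j; `information_gain`
# is the toy helper sketched earlier in this thread.
function best_weighted_split(y::AbstractVector, X::AbstractMatrix, w::AbstractVector)
    best = (gain = -Inf, feat = 0, thresh = 0.0)
    for j in axes(X, 2), t in unique(X[:, j])
        g = w[j] * information_gain(y, X[:, j], t)
        g > best.gain && (best = (gain = g, feat = j, thresh = t))
    end
    return best
end

# Usage: weight feature 1 twice as heavily as feature 2.
# best_weighted_split(y, X, [2.0, 1.0])
```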