I have a data set of dimensions (87390, 243). Most of the columns are one-hot encoded categorical variables. The data set occupies ~160 MB in memory. I compared the memory usage of DecisionTree.jl and R's ranger package.
Thus, it appears that DecisionTree.jl is using 2.4x as much memory as ranger for this model. Is it possible to reduce the memory footprint of DecisionTree.jl? I can provide a scrubbed version of my data set if that helps.
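For reference, a minimal sketch in Julia of this kind of measurement, assuming a feature matrix and label vector of the stated shape. The data values, label names, and forest parameters below are illustrative stand-ins, not details from the original benchmark:

```julia
using DecisionTree

# Stand-in data at the dimensions from this issue (87390 x 243);
# the real data set was not shared, so the values are random.
X = rand(0:1, 87390, 243)    # one-hot style integer features
y = rand(["a", "b"], 87390)  # placeholder class labels

# @allocated reports total bytes allocated by the expression -- a rough
# proxy for memory pressure, not peak resident memory.
bytes = @allocated build_forest(y, X, 15, 100)  # n_subfeatures, n_trees
println("allocated ≈ ", round(bytes / 2^20, digits = 1), " MiB")
```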
You could cast the features to a concrete type (e.g. `X = Int.(X)`) instead of using the `Any` type, which is quite heavy. That should help a little bit.
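To illustrate the suggestion above, a small sketch comparing `Matrix{Any}` storage with concrete `Matrix{Int}` storage at the dimensions from this issue (`Base.summarysize` reports an object's total in-memory footprint):

```julia
# Placeholder values at the shape from this issue (87390 x 243).
X_any = Matrix{Any}(rand(0:1, 87390, 243))  # each element is a boxed, pointer-referenced object
X_int = Int.(X_any)                         # flat, concrete Int64 storage

println(Base.summarysize(X_any) / 2^20, " MiB")  # pointers plus one box per element
println(Base.summarysize(X_int) / 2^20, " MiB")  # 87390 * 243 * 8 bytes ≈ 162 MiB
```

Since the columns are one-hot encoded, a narrower concrete element type (e.g. `Bool` or `UInt8`) should shrink the input matrix further still, though the fitted trees, rather than the input, may dominate total memory use.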
But otherwise, we need a new implementation of the Leaf type (see #90), which requires a significant amount of work.
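For context on why that rework matters: the `Leaf` type stores every training label that reaches the leaf, so a fitted forest retains a large slice of the training set. Roughly (the first struct is simplified from the package source; the `CompactLeaf` below is a purely hypothetical sketch, not the design proposed in #90):

```julia
# Simplified view of the existing Leaf: it keeps all labels that landed here.
struct Leaf{T}
    majority :: T          # predicted class
    values   :: Vector{T}  # every training label at this leaf -- the memory hog
end

# Hypothetical slimmer variant: keep only what prediction needs.
struct CompactLeaf{T}
    majority :: T    # predicted class
    total    :: Int  # number of training samples at this leaf
    correct  :: Int  # how many matched the majority (enough for a confidence score)
end
```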