Merge pull request #56 from imbs-hl/fix_regression
Check for pure nodes in regression trees
mnwright committed Apr 29, 2016
2 parents ea95dc6 + be88a57 commit d980670
Showing 5 changed files with 26 additions and 4 deletions.
3 changes: 3 additions & 0 deletions NEWS.md
@@ -1,3 +1,6 @@
+##### Version 0.4.1
+* Runtime improvement for regression forests on classification data
+
 ##### Version 0.4.0
 * New CRAN version. New CRAN versions will be 0.x.0, development versions 0.x.y

4 changes: 2 additions & 2 deletions ranger-r-package/ranger/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: ranger
 Type: Package
 Title: A Fast Implementation of Random Forests
-Version: 0.4.0
-Date: 2016-04-07
+Version: 0.4.1
+Date: 2016-04-28
 Author: Marvin N. Wright
 Maintainer: Marvin N. Wright <[email protected]>
 Description: A fast implementation of Random Forests, particularly suited for high dimensional data. Ensembles
3 changes: 3 additions & 0 deletions ranger-r-package/ranger/NEWS
@@ -1,3 +1,6 @@
+##### Version 0.4.1
+* Runtime improvement for regression forests on classification data
+
 ##### Version 0.4.0
 * Reduce memory usage of saved forest objects (changed child.nodeIDs interface)
 * Add keep.inbag option to track in-bag counts
18 changes: 17 additions & 1 deletion source/src/Tree/TreeRegression.cpp
@@ -81,7 +81,23 @@ bool TreeRegression::splitNodeInternal(size_t nodeID, std::vector<size_t>& possi
     return true;
   }
 
-  // Find best split, stop if no decrease of impurity
+  // Check if node is pure and set split_value to estimate and stop if pure
+  bool pure = true;
+  double pure_value = 0;
+  for (size_t i = 0; i < sampleIDs[nodeID].size(); ++i) {
+    double value = data->get(sampleIDs[nodeID][i], dependent_varID);
+    if (i != 0 && value != pure_value) {
+      pure = false;
+      break;
+    }
+    pure_value = value;
+  }
+  if (pure) {
+    split_values[nodeID] = pure_value;
+    return true;
+  }
+
+  // Find best split, stop if no decrease of impurity
   bool stop = findBestSplit(nodeID, possible_split_varIDs);
   if (stop) {
     split_values[nodeID] = estimate(nodeID);
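
The purity check added in this hunk can be read in isolation. The following is a minimal sketch, not ranger code: `isPureNode` and its `responses` argument are hypothetical names standing in for the loop over `sampleIDs[nodeID]` and the `data->get(...)` lookups. It illustrates why the check is cheap: the scan stops at the first response that differs, and when every response is identical the node can take that value directly, before `findBestSplit` is ever called.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ranger): returns true if every response
// value in the node is identical, and writes that common value to pure_value.
// Mirrors the loop in the hunk above, including its behavior for an empty
// node (reported as pure with value 0).
bool isPureNode(const std::vector<double>& responses, double& pure_value) {
  pure_value = 0;
  for (std::size_t i = 0; i < responses.size(); ++i) {
    // From the second sample on, any difference means the node is not pure.
    if (i != 0 && responses[i] != pure_value) {
      return false;
    }
    pure_value = responses[i];
  }
  return true;
}
```

On classification-style data (a response with only a few distinct values), deep nodes are often pure, so this linear scan would replace a full split search in exactly the case named in the NEWS entry.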
2 changes: 1 addition & 1 deletion source/src/version.h
@@ -1,3 +1,3 @@
 #ifndef RANGER_VERSION
-#define RANGER_VERSION "0.4.0"
+#define RANGER_VERSION "0.4.1"
 #endif