dobedobedo/Random_Forest_Classification

Random Forest Classification

Load an Excel file and run a random forest classification based on the user-selected features and label.

Before executing the script, set the inputfile variable to the path of the desired Excel spreadsheet. By default, the first row is read as the header and the first column as the index of a pandas DataFrame.
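The loading step described above can be sketched as follows. This is a minimal illustration rather than the script's actual code: the function name `load_spreadsheet` and the path `data.xlsx` are hypothetical, and reading `.xlsx` files with pandas is assumed to rely on an Excel engine such as openpyxl being installed.

```python
import pandas as pd

# Hypothetical path; point this at your own spreadsheet.
inputfile = "data.xlsx"


def load_spreadsheet(path):
    """Read an Excel sheet into a DataFrame.

    header=0 uses the first row as column names and
    index_col=0 uses the first column as the row index.
    """
    return pd.read_excel(path, header=0, index_col=0)
```

Calling `load_spreadsheet(inputfile)` would then return a DataFrame whose columns are the candidate features and labels.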

Once the data is loaded successfully, the script prompts the user to select one or more features for classification, then one column as the class label. The trees of the random forest are grown in parallel, and 80% of the samples are picked at random for training. Warm start is enabled so that the classifier can be refined over 300 iterations. Class weights are adjusted in case the input labels are imbalanced. Once the classifier has been trained to a stable out-of-bag error, it is applied to all samples for classification.
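A rough sketch of this training procedure with scikit-learn is shown below. Several details are assumptions, not taken from the script: the function name `train_forest`, the use of `train_test_split` for the 80% sample, the choice to add one tree per iteration starting from 20 trees, and the explicit "balanced" class weights computed on the training labels.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight


def train_forest(df, feature_names, label_name, n_iterations=300,
                 first_n_trees=20, random_state=0):
    """Train a warm-started random forest and track out-of-bag accuracy."""
    X = df[list(feature_names)].to_numpy()
    y = df[label_name].to_numpy()

    # Randomly keep 80% of the samples for training (assumed mechanism).
    X_train, _, y_train, _ = train_test_split(
        X, y, train_size=0.8, random_state=random_state)

    # Weight classes inversely to their frequency to offset imbalance.
    classes = np.unique(y_train)
    weights = compute_class_weight("balanced", classes=classes, y=y_train)
    class_weight = dict(zip(classes, weights))

    clf = RandomForestClassifier(
        n_estimators=first_n_trees,
        warm_start=True,      # keep existing trees between fit() calls
        oob_score=True,       # compute out-of-bag accuracy after each fit
        class_weight=class_weight,
        n_jobs=-1,            # grow trees in parallel on all cores
        random_state=random_state,
    )

    oob_accuracy = []
    for i in range(n_iterations):
        # Each iteration adds one tree to the existing forest (assumption).
        clf.set_params(n_estimators=first_n_trees + i)
        clf.fit(X_train, y_train)
        oob_accuracy.append(clf.oob_score_)

    # Apply the trained classifier to every sample, not just the 80%.
    predicted = clf.predict(X)
    return clf, predicted, oob_accuracy
```

Starting from a small forest instead of a single tree is a design choice of this sketch: with very few trees, some samples are never out of bag, and the out-of-bag score becomes unreliable.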

The output consists of two figures. The first is the confusion matrix of the classification results, annotated with the accuracy. The second shows how the out-of-bag accuracy changes during the iterations.
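The two figures could be produced along the following lines. This is an illustrative sketch only: the function name `plot_results`, the titles, and the axis labels are assumptions, and the script's actual plotting code may differ.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import (ConfusionMatrixDisplay, accuracy_score,
                             confusion_matrix)


def plot_results(y_true, y_pred, oob_accuracy):
    """Return (confusion-matrix figure, out-of-bag accuracy figure)."""
    labels = np.unique(np.concatenate([y_true, y_pred]))
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    accuracy = accuracy_score(y_true, y_pred)

    # Figure 1: confusion matrix with the overall accuracy in the title.
    fig_cm, ax_cm = plt.subplots()
    ConfusionMatrixDisplay(confusion_matrix=cm,
                           display_labels=labels).plot(ax=ax_cm)
    ax_cm.set_title(f"Confusion matrix (accuracy = {accuracy:.3f})")

    # Figure 2: out-of-bag accuracy recorded at each iteration.
    fig_oob, ax_oob = plt.subplots()
    ax_oob.plot(range(1, len(oob_accuracy) + 1), oob_accuracy)
    ax_oob.set_xlabel("Iteration")
    ax_oob.set_ylabel("Out-of-bag accuracy")
    ax_oob.set_title("Out-of-bag accuracy during training")

    return fig_cm, fig_oob
```

Call `plt.show()` afterwards to display both figures interactively, or use `fig.savefig(...)` on each returned figure to write them to disk.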

Dependencies: numpy, pandas, scikit-learn, and matplotlib.
