Supervised Machine Learning Projects

Final Project

• The Fire-Risk Assessment project was developed by the National Park Service's (NPS) Fire and Aviation Management program in response to the devastating 2011 wildfire season, and it holds data from 1970 to 2020.
• The dataset can be found at https://data-nifc.opendata.arcgis.com/datasets/facility.

• The aim of the project is to build a predictive model that evaluates contributing factors such as access to the facility, the surrounding environment, construction design, and the materials and resources available to protect facilities from wildland fire.

• The dataset was obtained from: https://data-nifc.opendata.arcgis.com/datasets/facility?geometry=50.977%2C-89.991%2C-50.977%2C-89.336 (last accessed on 10 Oct 2020). An overview of the dataset is given in Table 1.
• We divided the dataset into nine tables in a SQL database.
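As a rough illustration of splitting a flat dataset into related SQL tables, the sketch below uses Python's built-in `sqlite3` module with only two tables instead of nine. The table names, column names, and sample rows are all hypothetical and are not taken from the actual facility dataset or the project's schema.

```python
import sqlite3

# Hypothetical flat records; the real dataset's columns may differ.
# (facility_id, facility_name, access_rating, roof_material, rating)
FLAT_ROWS = [
    (1, "Ranger Station A", "good", "metal", "Low"),
    (2, "Visitor Center B", "poor", "wood shake", "High"),
    (3, "Maintenance Shed C", "fair", "asphalt", "Moderate"),
]


def build_database(rows):
    """Load flat rows into two related tables keyed on facility_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE facility (
            facility_id   INTEGER PRIMARY KEY,
            facility_name TEXT NOT NULL,
            rating        TEXT
        );
        CREATE TABLE construction (
            facility_id   INTEGER PRIMARY KEY REFERENCES facility(facility_id),
            access_rating TEXT,
            roof_material TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO facility VALUES (?, ?, ?)",
        [(r[0], r[1], r[4]) for r in rows],
    )
    conn.executemany(
        "INSERT INTO construction VALUES (?, ?, ?)",
        [(r[0], r[2], r[3]) for r in rows],
    )
    conn.commit()
    return conn


def load_modeling_rows(conn):
    """Join the tables back into one row per facility for modeling."""
    query = """
        SELECT f.facility_id, c.access_rating, c.roof_material, f.rating
        FROM facility AS f
        JOIN construction AS c ON f.facility_id = c.facility_id
        ORDER BY f.facility_id
    """
    return conn.execute(query).fetchall()


conn = build_database(FLAT_ROWS)
modeling_rows = load_modeling_rows(conn)
```

Keeping a shared key such as `facility_id` in every table is what lets the tables be joined back into a single modeling frame later.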

PROPOSED ANALYSIS:
• Performing EDA (.ipynb notebook)
• Data cleaning
• Suitable data imputation
• Dividing the data into train and test sets
• Feature importance analysis
• ML algorithms
• Summary
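The imputation, train/test split, and feature-importance steps listed above could look roughly like the following scikit-learn sketch. It runs on synthetic data, not the facility dataset. The feature names, the median imputation strategy, the 80/20 split, and the choice of Random Forest for the importances are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for facility features (hypothetical names).
feature_names = ["access_score", "vegetation_density", "roof_score", "water_supply"]
X = rng.normal(size=(400, 4))
# The label depends mostly on the first two features, plus a little noise.
y = (X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=400) > 0).astype(int)

# Blank out roughly 10% of the values to mimic missing survey entries.
missing = rng.random(X.shape) < 0.10
X[missing] = np.nan

# Hold out 20% of the rows, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Imputation is fitted on the training split only, inside the pipeline.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=200, random_state=0),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
importances = model.named_steps["randomforestclassifier"].feature_importances_
ranked = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)

for name, value in ranked:
    print(f"{name}: {value:.3f}")
print(f"test accuracy: {test_accuracy:.3f}")
```

Placing the imputer inside the pipeline means the medians are learned from the training rows alone, so no information from the test set leaks into the fitted model.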

CONCLUSION:
• Ensemble methods such as Random Forest and XGBoost perform well in predicting the ‘Rating’ variable for wildfire hazard.
• The SVM with a sigmoid kernel does not perform satisfactorily.
• The most suitable ML algorithm depends on the dataset; there is no hard-and-fast rule.
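A comparison of this kind can be reproduced in outline with cross-validation, as in the sketch below. It uses a synthetic three-class problem from `make_classification` as a stand-in for the ‘Rating’ target, so the scores it prints say nothing about the facility data. XGBoost is left out to keep the example within scikit-learn; an XGBoost classifier could be added to the `candidates` dictionary in the same way. The hyperparameters and the 5-fold setup are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic three-class data standing in for the 'Rating' target.
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=4,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Candidate models; the SVM is scaled first because kernels are scale-sensitive.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm_sigmoid": make_pipeline(StandardScaler(), SVC(kernel="sigmoid")),
}

# Mean 5-fold cross-validated accuracy for each candidate.
cv_scores = {
    name: cross_val_score(estimator, X, y, cv=5, scoring="accuracy").mean()
    for name, estimator in candidates.items()
}

for name, score in cv_scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Scoring every candidate with the same folds and the same metric keeps the comparison fair, which matters when the best algorithm depends on the dataset.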