Project of Udacity Data Scientist Nanodegree Program
Required packages are listed in requirement.txt.
Follow these steps to run the app:
- Run the following commands in the project's root directory to set up your database and model.
- To run the ETL pipeline that cleans the data and stores it in the database:
python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
- To run the ML pipeline that trains the classifier and saves it:
python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
- Run the following command in the app's directory to run your web app.
python run.py
- Go to http://0.0.0.0:3001/
This project demonstrates an end-to-end implementation of machine learning in a real-world setting, which includes the following steps:
- An ETL process that extracts the data, cleans it and stores the clean data in a SQLite database.
- An ML pipeline that uses NLP, Pipeline and GridSearchCV to classify the data.
- Deployment of the model as a web app.
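The ETL step above can be sketched with the Python standard library alone. This is a minimal illustration, not the code in process_data.py: the sample rows, the `related-1;request-0` category format, and the helper names `parse_categories`, `clean` and `store` are all assumptions made for the example.

```python
import sqlite3

# Hypothetical sample rows; the real CSV columns may differ.
messages = [
    {"id": 1, "message": "We need water and food"},
    {"id": 2, "message": "Is the storm over?"},
    {"id": 2, "message": "Is the storm over?"},  # duplicate row to be dropped
]
categories = [
    {"id": 1, "categories": "related-1;request-1;offer-0"},
    {"id": 2, "categories": "related-1;request-0;offer-0"},
]


def parse_categories(raw):
    """Turn 'related-1;request-0' into {'related': 1, 'request': 0}."""
    parsed = {}
    for item in raw.split(";"):
        name, _, value = item.rpartition("-")
        parsed[name] = int(value)
    return parsed


def clean(messages, categories):
    """Join messages with their parsed categories and drop duplicate ids."""
    labels_by_id = {row["id"]: parse_categories(row["categories"]) for row in categories}
    seen, cleaned = set(), []
    for row in messages:
        if row["id"] in seen or row["id"] not in labels_by_id:
            continue
        seen.add(row["id"])
        cleaned.append({**row, **labels_by_id[row["id"]]})
    return cleaned


def store(rows, connection, table="messages"):
    """Create a table from the row keys and insert every cleaned row."""
    columns = list(rows[0])
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'CREATE TABLE "{table}" ({column_sql})')
    connection.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(row[c] for c in columns) for row in rows],
    )
    connection.commit()


cleaned_rows = clean(messages, categories)
connection = sqlite3.connect(":memory:")  # in-memory database for the sketch
store(cleaned_rows, connection)
row_count = connection.execute('SELECT COUNT(*) FROM "messages"').fetchone()[0]
```

An in-memory database keeps the sketch free of side effects; passing a file path to `sqlite3.connect` would write a database file instead.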
There are 3 directories here.
- Directory app contains the script to start the web app, run.py, and the webpage templates in the subdirectory templates.
- Directory data contains the original data, disaster_categories.csv and disaster_messages.csv, the ETL script process_data.py, and the database DisasterResponse.db, which stores the cleaned data.
- Directory models contains the ML script train_classifier.py and the saved trained ML model final_model.py.
It should be pointed out that there is still much room for improvement. An obvious problem is that the data is imbalanced, which has strongly influenced the accuracy and precision of the trained model.
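To illustrate why imbalance distorts such metrics, here is a small sketch with made-up labels (95 negatives, 5 positives; these numbers are an assumption, not taken from the project's data). A classifier that always predicts the majority class looks accurate while finding none of the positive messages.

```python
# Hypothetical labels: 95 negative messages and 5 positive ones.
y_true = [0] * 95 + [1] * 5
# A trivial "model" that always predicts the majority class.
y_pred = [0] * 100


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model found."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)
print(baseline_accuracy)  # 0.95
print(baseline_recall)    # 0.0
```

For this reason, per-class recall and precision are usually more informative than overall accuracy on data like this.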
Another improvement is to deploy the model on a cloud server rather than locally.
Finally, due to limited computation power, GridSearchCV is used here only to demonstrate how to employ it in the pipeline rather than to provide optimized training results.
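As a rough illustration of combining Pipeline and GridSearchCV with a deliberately small grid, consider the sketch below. It is not the code in train_classifier.py: the toy messages, the single binary label, the choice of TfidfVectorizer and LogisticRegression, and the grid values are all assumptions made to keep the example fast.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy data: 1 = request for help, 0 = unrelated message.
messages = [
    "we need water and food",
    "people are trapped under the rubble",
    "please send medical help",
    "our house is flooded, need shelter",
    "the weather is nice today",
    "thank you for the update",
    "what time does the store open",
    "watching the game tonight",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Text vectorization and classification chained into one estimator.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# A tiny grid: 2 x 2 = 4 candidates, each fitted on 2 folds.
# Keys use the "<step name>__<parameter>" convention.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(messages, labels)

# The refitted best pipeline is used for prediction.
predictions = search.predict(["need food and shelter"])
```

The cost grows with the number of candidates times the number of folds, so a wider grid or more folds quickly becomes expensive on limited hardware.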
Credit must be given to Udacity for the project. Otherwise, feel free to use the code here as you would like!