The experiments were developed as Python notebooks. For each experiment, one notebook was created per model; for the Cancer Breast dataset, for example, four Python notebooks were created, one for each model. The same approach was applied to the Credit Card dataset.
Running a notebook saves its results in the result folder of the corresponding experiment. A CSV file is created with the metric values for each scenario of the experiment (number of features), and a chart is created for each metric.
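As a rough illustration of the CSV output described above, the sketch below writes one row per scenario (number of features). The `save_metrics` helper, the `metrics.csv` file name, and the column names are assumptions made for illustration and are not taken from the notebooks; chart generation is omitted.

```python
import csv
from pathlib import Path


def save_metrics(rows, result_dir, filename="metrics.csv"):
    """Write one CSV row per scenario (number of features used).

    `rows` is a non-empty list of dicts sharing the same keys. The helper
    name, file name and column names are hypothetical.
    """
    result_dir = Path(result_dir)
    # Create the result folder if it does not exist yet.
    result_dir.mkdir(parents=True, exist_ok=True)
    out_path = result_dir / filename
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

A notebook could call it once at the end of a run, passing one dict per number of features evaluated.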
Each experiment also has a notebook that gathers the results obtained for each model and creates reports and charts about the experiment as a whole.
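The gathering step can be pictured roughly as follows. This is a minimal sketch that assumes each model writes one CSV file into a shared result folder and that the file name identifies the model; the `gather_results` helper and that naming scheme are hypothetical, not the repository's actual code.

```python
import csv
from pathlib import Path


def gather_results(result_dir, pattern="*.csv"):
    """Merge every per-model CSV in `result_dir` into one list of rows.

    Each row gains a "model" column taken from the file name (without
    extension). The folder layout and naming scheme are assumptions.
    """
    combined = []
    for csv_path in sorted(Path(result_dir).glob(pattern)):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                combined.append({"model": csv_path.stem, **row})
    return combined
```

The combined rows can then be fed to whatever reporting or plotting code the summary notebook uses. Note that `csv.DictReader` returns every value as a string, so metrics need to be converted to numbers before plotting.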
Finally, if you have any questions, read the README.md file or feel free to open an issue.
The thesis and the presentation are available in the docs folder.
git clone https://github.com/miguelpimentel/shap_feature_selection.git
Install Miniconda to create an environment; it is a convenient way to handle different versions of libraries and packages:
conda install -y jupyter
Add the content below to an xai.yml file. It lists the packages, and the versions where pinned, that the environment requires.
name: xai
dependencies:
  - python=3.9
  - pip>=19.0
  - jupyter
  - scikit-learn
  - scipy
  - pandas
  - pandas-datareader
  - matplotlib
  - pillow
  - tqdm
  - requests
  - h5py
  - pyyaml
  - flask
  - boto3
  - pip:
    - lime
    - shap
The following commands create an environment with the packages listed above, activate it, and register it as a Jupyter kernel:
conda env create -f xai.yml -n xai
conda activate xai
conda install nb_conda
python -m ipykernel install --user --name xai --display-name "XAI - Python Enviroment"
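After activating the environment, a quick check such as the one below can confirm that the listed packages are importable. It is only a sketch: the `missing_modules` helper is hypothetical, and the list uses import names, which differ from the package names in a few cases.

```python
import importlib.util

# Import names for the packages listed in xai.yml. Some differ from the
# package name: scikit-learn -> sklearn, pillow -> PIL, pyyaml -> yaml,
# pandas-datareader -> pandas_datareader.
REQUIRED_MODULES = [
    "sklearn", "scipy", "pandas", "pandas_datareader", "matplotlib", "PIL",
    "tqdm", "requests", "h5py", "yaml", "flask", "boto3", "lime", "shap",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

The check only confirms that each module can be located; it does not verify versions.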
The datasets are available at:
Each dataset must be placed at the path the notebooks expect:
- Add the creditcard.csv file to the credit_card_fraud/dataset/ folder.
- Add the data-te.csv file to the cancer_breast/dataset/ folder.
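Before launching a notebook, the dataset locations given above can be verified with a short script like this one. It is a sketch, run from the project root; the `missing_datasets` helper is a hypothetical name and is not part of the repository.

```python
from pathlib import Path

# Expected dataset locations, relative to the project root.
EXPECTED_DATASETS = [
    Path("credit_card_fraud/dataset/creditcard.csv"),
    Path("cancer_breast/dataset/data-te.csv"),
]


def missing_datasets(project_root="."):
    """Return the expected dataset paths that do not exist under `project_root`."""
    root = Path(project_root)
    return [path for path in EXPECTED_DATASETS if not (root / path).is_file()]


if __name__ == "__main__":
    for path in missing_datasets():
        print(f"Missing dataset: {path}")
```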
You can run an experiment with Jupyter Notebook by using the following command from the project root folder:
jupyter notebook </Experiment>
For example:
jupyter notebook cancer_breast/cancer-breast-catboost.ipynb
Miguel Pimentel