This project aims to predict equipment failures in healthcare facilities by analyzing various operational metrics like usage hours, temperature, vibration levels, and maintenance history. Using machine learning models, this project provides a way to implement predictive maintenance strategies, reducing downtime and maintenance costs.
The dataset used for this project is `healthcare_equipment_data.csv` (or the Excel version, `healthcare_equipment_data.xlsx`). It contains the following features:
- equipment_id: Unique ID for each equipment.
- usage_hours: Number of hours the equipment has been used.
- temperature: Operating temperature of the equipment (in degrees).
- vibration_level: Vibration intensity level of the equipment.
- pressure_level: Operating pressure level.
- last_maintenance: Hours since the last maintenance.
- failure: Binary target variable indicating if a failure occurred (1 = Yes, 0 = No).
- notebook.ipynb: Jupyter notebook containing the complete code for data preprocessing, visualization, model building, and evaluation.
- healthcare_equipment_data.csv: The CSV file containing the dataset (mock data provided).
- healthcare_equipment_data.xlsx: The Excel version of the dataset (optional).
- README.md: This file, providing an overview of the project.
- The notebook begins by importing the necessary libraries: `pandas`, `numpy`, `matplotlib`, `seaborn`, and machine learning tools from `sklearn`.
- The dataset is loaded using `pd.read_csv()` or `pd.read_excel()`.
- Initial data exploration includes checking the structure of the dataset (`.info()`, `.describe()`), handling missing values, and visualizing feature distributions.
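The loading and exploration steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the five inline rows are made-up stand-ins for `healthcare_equipment_data.csv`, and filling gaps with the column median is only one possible way to handle missing values (the notebook's strategy is not stated here).

```python
import io

import pandas as pd

# Made-up rows with the same columns as the dataset; two cells are left
# empty on purpose to demonstrate missing-value handling.
csv_text = """equipment_id,usage_hours,temperature,vibration_level,pressure_level,last_maintenance,failure
1,1200,65.2,0.8,30.1,150,0
2,5400,78.9,2.4,45.0,900,1
3,300,,0.3,28.7,20,0
4,7600,82.1,3.1,,1400,1
5,2500,70.4,1.1,33.3,400,0
"""

# In the project this would be pd.read_csv("healthcare_equipment_data.csv").
df = pd.read_csv(io.StringIO(csv_text))

df.info()             # column dtypes and non-null counts
print(df.describe())  # summary statistics for numeric columns

# Count missing values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Assumption: replace each missing numeric value with its column median.
df_filled = df.fillna(df.median(numeric_only=True))
```

Dropping incomplete rows with `df.dropna()` is an equally simple alternative when only a few rows are affected.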
- Histograms and correlation heatmaps are generated using `matplotlib` and `seaborn` to explore feature relationships.
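A rough sketch of these two plots is shown below. The data is randomly generated with arbitrary value ranges and an arbitrary failure rule, purely so the example runs on its own. The column names follow the feature list, but the figure sizes, bin count, and colour map are assumptions rather than the notebook's settings.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (ranges and failure rule are arbitrary assumptions).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "usage_hours": rng.uniform(0, 10_000, n_rows),
    "temperature": rng.normal(60, 10, n_rows),
    "vibration_level": rng.uniform(0, 5, n_rows),
    "pressure_level": rng.uniform(20, 100, n_rows),
    "last_maintenance": rng.uniform(0, 2_000, n_rows),
})
df["failure"] = (df["usage_hours"] > 6_000).astype(int)

feature_cols = ["usage_hours", "temperature", "vibration_level",
                "pressure_level", "last_maintenance"]

# One histogram per feature to inspect the distributions.
axes = df[feature_cols].hist(bins=20, figsize=(10, 6))
plt.tight_layout()

# Pairwise Pearson correlations, including the target column.
corr = df[feature_cols + ["failure"]].corr()
fig, ax = plt.subplots(figsize=(7, 5))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
ax.set_title("Feature correlation")
```

In a notebook the figures render inline; in a plain script, call `plt.show()` afterwards.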
- Preprocessing handles missing values, encodes categorical features (if applicable), and splits the dataset into training and testing sets.
- Features are standardized using `StandardScaler()` so that each has zero mean and unit variance before being passed to the machine learning models.
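The split-and-scale steps might look like the sketch below. The synthetic data, the 80/20 split, the fixed `random_state`, and the stratified sampling are all assumptions made for illustration. Excluding `equipment_id` from the features is also an assumption, made because it is an identifier rather than a measurement.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (ranges and failure rule are arbitrary assumptions).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "usage_hours": rng.uniform(0, 10_000, n_rows),
    "temperature": rng.normal(60, 10, n_rows),
    "vibration_level": rng.uniform(0, 5, n_rows),
    "pressure_level": rng.uniform(20, 100, n_rows),
    "last_maintenance": rng.uniform(0, 2_000, n_rows),
})
df["failure"] = (df["usage_hours"] > 6_000).astype(int)

X = df.drop(columns="failure")
y = df["failure"]

# Hold out 20% of the rows for testing, keeping the class ratio similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training set only, then reuse it for the test set,
# so that no test-set statistics leak into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Tree-based models such as random forests are largely insensitive to feature scale, so the scaling step mainly matters if other model types are tried on the same data.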
- A Random Forest Classifier is used for the predictive task.
- The model is trained on the training set, and predictions are made on the test set.
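As a minimal sketch of the training step, assuming scikit-learn's default of 100 trees and a fixed seed (the notebook's hyperparameters are not listed here), and again using synthetic data in place of the real file:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (ranges and failure rule are arbitrary assumptions).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "usage_hours": rng.uniform(0, 10_000, n_rows),
    "temperature": rng.normal(60, 10, n_rows),
    "vibration_level": rng.uniform(0, 5, n_rows),
    "pressure_level": rng.uniform(20, 100, n_rows),
    "last_maintenance": rng.uniform(0, 2_000, n_rows),
})
df["failure"] = (df["usage_hours"] > 6_000).astype(int)

X = df.drop(columns="failure")
y = df["failure"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train on the training split only.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict the failure label (0 or 1) for each held-out row.
y_pred = model.predict(X_test)
print(y_pred[:10])
```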
- The model is evaluated using a confusion matrix, accuracy score, and a classification report. These metrics help assess the model's performance in predicting equipment failures.
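To make the three metrics concrete, the sketch below computes them on ten hand-written labels rather than on real model output. The label values and the class names passed to `target_names` are illustrative assumptions.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Hand-written example labels: 1 = failure, 0 = no failure.
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Rows are true classes, columns are predicted classes:
# [[true negatives, false positives],
#  [false negatives, true positives]]
cm = confusion_matrix(y_true, y_pred)
print(cm)

# Fraction of predictions that match the true label.
acc = accuracy_score(y_true, y_pred)
print(f"accuracy = {acc:.2f}")

# Per-class precision, recall, and F1 score.
print(classification_report(y_true, y_pred,
                            target_names=["no failure", "failure"]))
```

Here the matrix is `[[5, 1], [1, 3]]` and the accuracy is 0.80. When failures are rare, accuracy alone can look good even if most failures are missed, so the recall of the failure class in the classification report is worth reading alongside it.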
- Feature importance is plotted to understand the impact of each feature on the prediction of failures.
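One way to produce such a plot is to read the fitted forest's `feature_importances_` attribute, as sketched below. The data is synthetic, with a failure rule that depends only on `usage_hours`, and the horizontal bar chart is an assumed presentation, not necessarily the one used in the notebook.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (ranges and failure rule are arbitrary assumptions).
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    "usage_hours": rng.uniform(0, 10_000, n_rows),
    "temperature": rng.normal(60, 10, n_rows),
    "vibration_level": rng.uniform(0, 5, n_rows),
    "pressure_level": rng.uniform(20, 100, n_rows),
    "last_maintenance": rng.uniform(0, 2_000, n_rows),
})
df["failure"] = (df["usage_hours"] > 6_000).astype(int)

X = df.drop(columns="failure")
y = df["failure"]

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

# Impurity-based importances, one value per feature, normalized to sum to 1.
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values()
print(importances)

# Horizontal bars, most important feature at the top.
fig, ax = plt.subplots(figsize=(7, 4))
importances.plot(kind="barh", ax=ax)
ax.set_xlabel("Importance")
ax.set_title("Random forest feature importance")
fig.tight_layout()
```

Impurity-based importances describe what the model relied on, not causal drivers of failure, and they can be split between correlated features.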
- A summary of the model's performance is provided, along with insights into which factors are most critical in predicting equipment failures.
- Clone the repository or download the files.
- Install dependencies: Ensure that you have the required Python libraries installed. You can install them using the following command: `pip install pandas numpy matplotlib seaborn scikit-learn`
- Run the Jupyter Notebook:
  - Open the `notebook.ipynb` file in a Jupyter environment and run the cells step by step.
  - You can use either the CSV or Excel version of the dataset by changing the file path passed to `pd.read_csv()` or `pd.read_excel()`. Reading the `.xlsx` file also requires the `openpyxl` package.
- Python 3.x
- Jupyter Notebook
- Libraries: `pandas`, `numpy`, `matplotlib`, `seaborn`, `scikit-learn`
Predictive maintenance is crucial for reducing equipment downtime and improving reliability in healthcare settings. By analyzing operational data, this project demonstrates how machine learning can be used to predict equipment failures, allowing for more efficient maintenance scheduling.