This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Bitcoin Price Prediction using LSTM #210

Closed
wants to merge 3 commits
38 changes: 38 additions & 0 deletions models/BitcoinPricePrediction/README.md
@@ -0,0 +1,38 @@
## Bitcoin Price Prediction using LSTM
This repository contains an implementation of a Long Short-Term Memory (LSTM) model for predicting Bitcoin prices from historical data. The model is built with Keras and TensorFlow: it learns to predict the closing price of Bitcoin from past data and is evaluated on a held-out test set using common regression metrics such as R², RMSE, and MAE.

## **Dataset**
The dataset used in this project is historical Bitcoin price data downloaded from a public source. The file BTC-USD.csv contains columns such as Date, Open, High, Low, Close, Adj Close, and Volume. The prediction is based solely on the Close price of Bitcoin.

**Dataset Preprocessing:**
* The data is checked for missing values.
* The Date column is converted to datetime format and set as the index.
* The closing prices are normalized using MinMaxScaler for better model performance (see the sketch below).
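
A minimal sketch of these steps, mirroring the notebook code (column names as in BTC-USD.csv):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

BTC = pd.read_csv("BTC-USD.csv")
print(BTC.isnull().sum())                    # check for missing values

BTC['Date'] = pd.to_datetime(BTC['Date'])    # parse dates
BTC.set_index('Date', inplace=True)          # index by date

scaler = MinMaxScaler(feature_range=(0, 1))  # scale closes into [0, 1]
scaled_data = scaler.fit_transform(BTC[['Close']].values)
```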

## Model Architecture
The implemented model is a multi-layer LSTM neural network that includes dropout layers to reduce overfitting. Here's an overview of the model:

**Input Layer:** Time series data reshaped to 3D (samples, time steps, features) for the LSTM layers.
**LSTM Layers:** Two LSTM layers with 200 and 160 units, respectively, with return_sequences enabled in the first layer.
**Dropout:** A dropout layer (rate 0.4) after the first LSTM layer to reduce overfitting.
**Dense Layers:** Two dense layers; the final layer has a single neuron for the regression output (the predicted closing price). The full stack is sketched below.
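
In Keras, the stack above corresponds to the following (taken from the training script; `time_step` is the input window length):

```python
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout

time_step = 50  # length of the input window

model = Sequential()
model.add(LSTM(200, return_sequences=True, input_shape=(time_step, 1)))
model.add(Dropout(0.4))
model.add(LSTM(160, return_sequences=False))
model.add(Dense(50))
model.add(Dense(1))  # single neuron: the predicted closing price
```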

## Model Hyperparameters
**Batch Size:** 32
**Epochs:** 50
**Loss Function:** Mean Squared Error (MSE)
**Optimizer:** Adam
**Metrics:** Mean Absolute Percentage Error (MAPE)
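
These settings map directly onto the compile and fit calls in the script:

```python
import tensorflow as tf

model.compile(optimizer='adam', loss='mean_squared_error',
              metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])
history = model.fit(X_train, y_train, batch_size=32, epochs=50,
                    validation_data=(X_test, y_test))
```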

## Evaluation Metrics
The model is evaluated using several regression metrics, including:

* **R² Score:** Measures the proportion of variance in the dependent variable that is predictable.
* **RMSE:** Root Mean Squared Error, used to measure the differences between predicted and observed values.
* **MSE:** Mean Squared Error, similar to RMSE but without square rooting.
* **MAE:** Mean Absolute Error, the average of the absolute errors between actual and predicted values.
* **MATE (Median Absolute Error):** a robust error measure, less sensitive to outliers than MAE.
* **SMATE (Scaled RMSE):** despite the name, the script computes this as RMSE divided by the mean of the actual values. A sketch of all six computations follows.
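
All of these are available in scikit-learn except the scaled variant, which the script computes by hand; a minimal sketch, with `y_true` and `y_pred` standing in for the actual and predicted closes on the original price scale:

```python
import numpy as np
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             median_absolute_error, r2_score)

r2 = r2_score(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_true, y_pred)
mate = median_absolute_error(y_true, y_pred)  # "MATE"
smate = rmse / np.mean(y_true)                # "SMATE": RMSE scaled by the mean
```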

A detailed comparison of the training and testing set metrics is included in the project.

2,714 changes: 2,714 additions & 0 deletions models/BitcoinPricePrediction/data/BTC-USD.csv

Large diffs are not rendered by default.

36 changes: 36 additions & 0 deletions models/BitcoinPricePrediction/model.py
@@ -0,0 +1,36 @@
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
import joblib

class LoanApprovalModel:
    def __init__(self):
        self.model = RandomForestClassifier()
        self.scaler = StandardScaler()

    def load_data(self, filepath):
        data = pd.read_csv(filepath)
        return data

    def preprocess_data(self, data):
        # Separate the features from the target column, then split 80/20
        X = data.drop('target', axis=1)
        y = data['target']
        return train_test_split(X, y, test_size=0.2, random_state=42)

    def train(self, X_train, y_train):
        # Fit the scaler on the training data, then train on the scaled features
        X_train_scaled = self.scaler.fit_transform(X_train)
        self.model.fit(X_train_scaled, y_train)

    def save_model(self, model_path, scaler_path):
        joblib.dump(self.model, model_path)
        joblib.dump(self.scaler, scaler_path)

if __name__ == "__main__":
    loan_model = LoanApprovalModel()
    data = loan_model.load_data('data/loan_data.csv')  # Example path
    X_train, X_test, y_train, y_test = loan_model.preprocess_data(data)
    loan_model.train(X_train, y_train)
    loan_model.save_model('saved_models/model.pkl', 'saved_models/scaler.pkl')


18 changes: 18 additions & 0 deletions models/BitcoinPricePrediction/predict.py
@@ -0,0 +1,18 @@
import joblib
import pandas as pd

class LoanApprovalPredictor:
    def __init__(self, model_path, scaler_path):
        self.model = joblib.load(model_path)
        self.scaler = joblib.load(scaler_path)

    def predict(self, input_data):
        input_data_scaled = self.scaler.transform(input_data)
        return self.model.predict(input_data_scaled)

if __name__ == "__main__":
    predictor = LoanApprovalPredictor('saved_models/model.pkl', 'saved_models/scaler.pkl')
    # Example input data, replace with actual data
    input_data = pd.DataFrame([[...]], columns=[...])  # Replace with actual column names
    predictions = predictor.predict(input_data)
    print(predictions)
@@ -0,0 +1,213 @@
# -*- coding: utf-8 -*-
"""Bitcoin Price Prediction LSTM.ipynb

Automatically generated by Colab.

Original file is located at
https://colab.research.google.com/drive/1kinADIkfmyvxsBWbJpmNRlJSqLCH1MgE

# Implementation of LSTM on Bitcoin Dataset
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from sklearn.metrics import mean_squared_error, mean_absolute_error, median_absolute_error, r2_score
import warnings
import seaborn as sns
import tensorflow as tf

warnings.filterwarnings("ignore")

# Load the data
BTC = pd.read_csv("BTC-USD.csv")

print(BTC.columns)

# Convert the 'Date' column to datetime and set it as the index
BTC['Date'] = pd.to_datetime(BTC['Date'])
BTC.set_index('Date', inplace=True)

BTC.head()

# Check for null values
print(BTC.isnull().sum())

# Plot the closing prices
plt.figure(figsize=(10, 6))
plt.plot(BTC['Close'], label='BTC Close')
plt.title('BTC Closing Prices')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.legend()
plt.show()

# Time Series Scatter Plot
plt.figure(figsize=(10, 6))
sns.scatterplot(x=BTC.index, y=BTC['Close'])
plt.title('Time Series Scatter Plot of BTC Closing Prices')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.show()

# Prepare the data for modeling
data = BTC[['Close']].values
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Split the data into training and testing sets
train_size = int(len(scaled_data) * 0.7)
train_data, test_data = scaled_data[:train_size], scaled_data[train_size:]
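
# Note: the split is chronological (no shuffling), so the test set is the
# most recent 30% of the series, which is appropriate for time-series evaluation.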

# Build sliding windows: each sample is `time_step` past closes (X) and the next close (Y)
def create_dataset(dataset, time_step):
    X, Y = [], []
    for i in range(len(dataset) - time_step - 1):
        a = dataset[i:(i + time_step), 0]
        X.append(a)
        Y.append(dataset[i + time_step, 0])
    return np.array(X), np.array(Y)
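
# Example: with time_step = 3 and a series [a, b, c, d, e, f], this yields
# X = [a, b, c] -> Y = d and X = [b, c, d] -> Y = e. Because the loop stops
# at len(dataset) - time_step - 1, the last possible window is dropped.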

# Create the training and testing datasets
time_step = 50
X_train, y_train = create_dataset(train_data, time_step)
X_test, y_test = create_dataset(test_data, time_step)

# Reshape to 3D (samples, time steps, features) as expected by the LSTM layers
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)

# Build the LSTM model
model = Sequential()
model.add(LSTM(200, return_sequences=True, input_shape=(time_step, 1)))
model.add(Dropout(0.4))
model.add(LSTM(160, return_sequences=False))
model.add(Dense(50))
model.add(Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error', metrics=[tf.keras.metrics.MeanAbsolutePercentageError()])

# Train the model
history = model.fit(X_train, y_train, batch_size=32, epochs=50, validation_data=(X_test, y_test))

# Make predictions
train_predict = model.predict(X_train)
test_predict = model.predict(X_test)

# Inverse transform the predictions
train_predict = scaler.inverse_transform(train_predict)
test_predict = scaler.inverse_transform(test_predict)

# Inverse transform the original target values (the 1-D arrays are wrapped
# in a list to give inverse_transform the 2-D shape it expects)
original_y_train = scaler.inverse_transform([y_train])
original_y_test = scaler.inverse_transform([y_test])

# Align the predictions with their positions in the full series for plotting.
# Training predictions start after the first `time_step` inputs; test
# predictions start one further window (plus the sample dropped by
# create_dataset) after the training segment ends.
train_predict_plot = np.empty_like(scaled_data)
train_predict_plot[:, :] = np.nan
train_predict_plot[time_step:len(train_predict) + time_step, :] = train_predict

test_predict_plot = np.empty_like(scaled_data)
test_predict_plot[:, :] = np.nan
test_predict_plot[len(train_predict) + (time_step * 2) + 1:len(scaled_data) - 1, :] = test_predict

plt.figure(figsize=(10, 6))
plt.plot(scaler.inverse_transform(scaled_data), label='Actual')
plt.plot(train_predict_plot, label='Train Predict')
plt.plot(test_predict_plot, label='Test Predict')
plt.title('Actual vs Predicted Values')
plt.xlabel('Date')
plt.ylabel('Closing Price')
plt.legend()
plt.show()

# Plot the model loss and MAPE over epochs
plt.figure(figsize=(12, 6))

# Plot Loss
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss (MSE) Over Epochs')
plt.xlabel('Epochs')
plt.ylabel('Loss (MSE)')
plt.legend()

# Plot MAPE
plt.subplot(1, 2, 2)
plt.plot(history.history['mean_absolute_percentage_error'], label='Training MAPE')
plt.plot(history.history['val_mean_absolute_percentage_error'], label='Validation MAPE')
plt.title('Model Accuracy (MAPE) Over Epochs')
plt.xlabel('Epochs')
plt.ylabel('MAPE (%)')
plt.legend()

plt.tight_layout()
plt.show()

# Plot the true series against the test predictions
plt.figure(figsize=(12, 6))
plt.plot(scaler.inverse_transform(scaled_data), label="True")
plt.plot(test_predict_plot, label="Test Predicted")
plt.title("True vs Predicted BTC Close Prices")
plt.legend()
plt.show()

# Calculate R², RMSE, MAE, MSE, MATE (median absolute error) and
# SMATE (mean-scaled RMSE) for the training and testing sets
train_r2 = r2_score(original_y_train[0], train_predict[:, 0])
test_r2 = r2_score(original_y_test[0], test_predict[:, 0])

train_rmse = np.sqrt(mean_squared_error(original_y_train[0], train_predict[:, 0]))
test_rmse = np.sqrt(mean_squared_error(original_y_test[0], test_predict[:, 0]))

train_mae = mean_absolute_error(original_y_train[0], train_predict[:, 0])
test_mae = mean_absolute_error(original_y_test[0], test_predict[:, 0])

train_mse = mean_squared_error(original_y_train[0], train_predict[:, 0])
test_mse = mean_squared_error(original_y_test[0], test_predict[:, 0])

train_mate = median_absolute_error(original_y_train[0], train_predict[:, 0])
test_mate = median_absolute_error(original_y_test[0], test_predict[:, 0])

# "SMATE" here is RMSE normalized by the mean of the actual values
train_smate = np.sqrt(mean_squared_error(original_y_train[0], train_predict[:, 0])) / np.mean(original_y_train)
test_smate = np.sqrt(mean_squared_error(original_y_test[0], test_predict[:, 0])) / np.mean(original_y_test)

print(f'Train R²: {train_r2}')
print(f'Test R²: {test_r2}')

print("Training RMSE: ", train_rmse)
print("Testing RMSE: ", test_rmse)

print("Training MAE: ", train_mae)
print("Testing MAE: ", test_mae)

print("Training MSE: ", train_mse)
print("Testing MSE: ", test_mse)

print("Training MATE: ", train_mate)
print("Testing MATE: ", test_mate)

print("Training SMATE: ", train_smate)
print("Testing SMATE: ", test_smate)

from tabulate import tabulate

# Summarize the train/test metrics in a table
table = [
    ["Metric", "Training", "Testing"],
    ["R² Score", f"{train_r2:.4f}", f"{test_r2:.4f}"],
    ["RMSE", f"{train_rmse:.4f}", f"{test_rmse:.4f}"],
    ["MSE", f"{train_mse:.4f}", f"{test_mse:.4f}"],
    ["MAE", f"{train_mae:.4f}", f"{test_mae:.4f}"],
    ["MATE", f"{train_mate:.4f}", f"{test_mate:.4f}"],
    ["SMATE", f"{train_smate:.4f}", f"{test_smate:.4f}"]
]

print(tabulate(table, headers="firstrow", tablefmt="grid"))


