NYC Airbnb Review Prediction

This project aims to predict the number of reviews for Airbnb listings in New York City using various machine learning models, including Linear Regression, Decision Trees, and Random Forest.

Introduction

The models are trained on listing attributes and compared on how well they predict the number of reviews a listing has received. Accurate predictions can help hosts and property managers optimize their listings and improve customer satisfaction.

Dataset

The dataset used in this project is the New York City Airbnb Open Data, included in the repository as airbnbListingsData.csv. It contains detailed information about Airbnb listings in NYC, including the number of reviews, price, location, and other relevant features.

Data Preprocessing

The data preprocessing steps include:

- Handling missing values
- Encoding categorical variables
- Normalizing numerical features
- Splitting the data into training and testing sets
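
The steps above can be sketched with pandas and scikit-learn. This is a minimal illustration on a small made-up table, not the notebook's actual code. The column names (`price`, `minimum_nights`, `room_type`, `number_of_reviews`), the median and most-frequent imputation, the use of standardization for "normalizing", and the 80/20 split are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the real dataset has many more rows and columns.
df = pd.DataFrame({
    "price": [150.0, 89.0, np.nan, 200.0, 65.0, 120.0, 99.0, 310.0, 75.0, 180.0],
    "minimum_nights": [1, 3, 2, 30, 1, 2, 5, 1, 7, 2],
    "room_type": ["Entire home/apt", "Private room", "Private room", np.nan,
                  "Shared room", "Entire home/apt", "Private room",
                  "Entire home/apt", "Shared room", "Private room"],
    "number_of_reviews": [45, 12, 0, 3, 88, 27, 9, 5, 61, 14],
})

numeric_cols = ["price", "minimum_nights"]
categorical_cols = ["room_type"]

X = df[numeric_cols + categorical_cols]
y = df["number_of_reviews"]

# Split first so imputation and scaling statistics come from training rows only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

preprocess = ColumnTransformer(
    [
        # Numeric: fill missing values with the median, then standardize.
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ]), numeric_cols),
        # Categorical: fill missing values with the mode, then one-hot encode.
        ("cat", Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", OneHotEncoder(handle_unknown="ignore")),
        ]), categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

X_train_prepared = preprocess.fit_transform(X_train)
X_test_prepared = preprocess.transform(X_test)
print(X_train_prepared.shape, X_test_prepared.shape)
```

Fitting the transformers on the training split and only applying them to the test split keeps test-set statistics out of the model inputs.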

Models

Linear Regression

Linear Regression is a simple and interpretable model that attempts to predict the target variable by fitting a linear relationship between the input features and the target.
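
As a small illustration of why a linear model is considered interpretable, the sketch below fits scikit-learn's `LinearRegression` to synthetic data with a known linear relationship and reads the fitted coefficients back. The data is generated for this example and is unrelated to the Airbnb dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Synthetic target: y = 3*x0 - 2*x1 + 5, plus a little noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(0, 0.1, size=200)

model = LinearRegression().fit(X, y)

# Each coefficient is the expected change in y per unit change in that feature,
# holding the other feature fixed.
print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
```

The fitted coefficients should land close to the values used to generate the data, which is what makes the model easy to read.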

Decision Trees

Decision Trees model the data by splitting it into subsets based on the value of input features, forming a tree-like structure. This model is easy to interpret and can capture non-linear relationships.
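
To illustrate the point about non-linear relationships, the sketch below fits a linear model and a shallow decision tree to synthetic data whose target jumps at a threshold. The data is generated for this example, `max_depth=3` is an arbitrary choice, and the scores are computed on the training data purely to show the difference in fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
# The target jumps from about 5 to about 50 at x = 5.
y = np.where(X[:, 0] < 5, 5.0, 50.0) + rng.normal(0, 1, size=500)

linear = LinearRegression().fit(X, y)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

linear_r2 = linear.score(X, y)
tree_r2 = tree.score(X, y)
print(f"Linear R²: {linear_r2:.3f}, Tree R²: {tree_r2:.3f}")

# The learned splits can be printed as readable if/else rules.
print(export_text(tree, feature_names=["x"]))
```

A straight line cannot follow the jump, while the tree can place a split near it. The printed rules show how the tree's predictions can be traced by hand.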

Random Forest

Random Forest is an ensemble method that builds multiple decision trees, each on a bootstrap sample of the training data, and averages their predictions. Averaging typically improves accuracy and reduces the overfitting that a single deep tree is prone to.
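
The effect of combining trees can be sketched by comparing a single fully grown tree with a forest on held-out data. The example uses scikit-learn's synthetic `make_friedman1` problem rather than the Airbnb data, and the sample size, noise level, and `n_estimators=100` are assumptions chosen for illustration.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic non-linear regression problem with noisy targets.
X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A single fully grown tree fits the training data almost perfectly.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# A forest averages many such trees, each grown on a bootstrap sample.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_train_r2 = tree.score(X_train, y_train)
tree_test_r2 = tree.score(X_test, y_test)
forest_test_r2 = forest.score(X_test, y_test)
print(f"Tree   R² train: {tree_train_r2:.3f}, test: {tree_test_r2:.3f}")
print(f"Forest R² test:  {forest_test_r2:.3f}")
```

A large gap between the single tree's training and test scores is a sign of overfitting. The forest's higher test score reflects the variance reduction from averaging.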

Results

Each model is evaluated using Root Mean Squared Error (RMSE, lower is better) and R-squared (R², higher is better):

Linear Regression:
- RMSE: 43.37
- R²: 0.300
Decision Trees:
- RMSE: 28.18
- R²: 0.705
Random Forest:
- RMSE: 23.08
- R²: 0.802
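
For reference, both metrics can be computed directly from predictions. The sketch below uses a handful of made-up review counts, not the project's data, and checks the hand-written formulas against scikit-learn's `mean_squared_error` and `r2_score`.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true and predicted review counts, for illustration only.
y_true = np.array([10.0, 0.0, 45.0, 3.0, 120.0])
y_pred = np.array([14.0, 2.0, 40.0, 8.0, 95.0])

# RMSE: square root of the mean squared residual, in the target's units.
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# R²: 1 minus the ratio of the residual to the total sum of squares.
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# The same values via scikit-learn.
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))
r2_sklearn = r2_score(y_true, y_pred)

print(f"RMSE: {rmse:.2f}, R²: {r2:.3f}")
```

Because RMSE is in the units of the target, the RMSE of 23.08 reported for Random Forest means its predictions are off by roughly 23 reviews in the root-mean-square sense.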

Conclusion

The Random Forest model performed best of the three, with the lowest RMSE (23.08) and the highest R² (0.802). The simpler models still have their strengths, such as the interpretability of Linear Regression and Decision Trees, and may be preferable depending on the requirements.

What I Learned

- Data preprocessing is crucial for preparing a dataset for modeling.
- Evaluating several models is important for finding the best one for the task.
- Model complexity trades off against interpretability.

Usage

To use the models in this project, follow these steps:

Clone the repository:
```
git clone https://github.com/mainoahmuna/NY_airbnb_reviews.git
```
Navigate to the project directory:
```
cd NY_airbnb_reviews
```
Install the required dependencies (see Installation).

Run Jupyter Notebook and open airbnb_ny.ipynb to see the analysis and model training:
```
jupyter notebook
```

Installation

Ensure you have Python 3.7+ and the following libraries installed:

- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
