- Upload the car dataset (`selling_cars_list.csv`, available from the link above) or select it from the dropdown.
- The cleaned and scaled data is then displayed automatically, and you are ready to run the program.
- Click the "Run KMeans" button to run the algorithm and watch the process: how the clusters change, along with visualizations of each step.
- After KMeans finishes, save one or more of the results (the newly generated cluster/category of your choice) as a CSV file on your computer, then move on to the next step.
- To use the TOPSIS method, select TOPSIS from the dropdown menu in the left-hand panel.
- Upload the CSV file of the category you saved, or select it from the dropdown, and run the TOPSIS method.
- Set your preferences/weights to narrow the search down to the best matches for you.
Car Analysis and Recommendation System built with Pandas, NumPy, and Matplotlib, with Streamlit for the UI and visualization.
Performs data preprocessing and clustering on a car dataset. Uses the KMeans algorithm to group cars into 2-6 clusters based on features like year, price, kilometers driven, engine capacity, and horsepower. Implements the elbow method to determine the optimal number of clusters. Visualizes the clustering results using PCA for dimensionality reduction.
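The clustering steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic, and the choice of `StandardScaler`, the `random_state` values, and the final `n_clusters=3` are assumptions made only for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the car features (NOT the project's dataset).
# Columns: year, price, kilometers driven, engine capacity, horsepower.
rng = np.random.default_rng(0)
centers = np.array([
    [2008, 4000, 220000, 1600, 90],
    [2015, 12000, 120000, 2000, 150],
    [2020, 45000, 30000, 3000, 300],
])
spread = np.array([2, 1500, 20000, 200, 15])
X = np.vstack([c + rng.normal(size=(50, 5)) * spread for c in centers])

# Scale the features so large-valued columns (price, kilometers)
# do not dominate the distance computation. The scaler is an assumption.
X_scaled = StandardScaler().fit_transform(X)

# Elbow method: fit KMeans for k = 2..6 and record the inertia
# (within-cluster sum of squared distances) for each k.
inertias = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")

# Fit the final model with a k picked by inspecting the elbow
# (3 here only because the synthetic data has three groups).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Project the 5-dimensional features to 2-D with PCA for plotting.
coords = PCA(n_components=2).fit_transform(X_scaled)
```

The "elbow" is the value of k after which the inertia stops dropping sharply. The `coords` array can then be passed to a Matplotlib scatter plot, colored by `labels`, to visualize the clusters.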
Categorizes cars into clusters:
- "Best offers(the best stats)",
- "Balanced cars",
- "Huge sized vans/mini-vans/jeeps/4x4/5+ seats etc...",
- "Fast luxury cars(expensive, but fast and new)",
- "Cheap and old cars(worst offer, worst stats)".
or, in short form:
- Good - best offers
- Normal - balanced
- Huge sized
- Fast luxury
- Cheap, budget
Implements the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) for car recommendation. Reads the clustered car data produced by the KMeans clustering step (the CSV saved from it), then applies TOPSIS to rank cars on multiple criteria:
- year
- price
- kilometers driven
- engine
- capacity
Provides a function to print the top-ranked cars with their details.
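For readers unfamiliar with TOPSIS, the ranking can be sketched in a few lines of NumPy. This is a generic textbook version under stated assumptions, not the code in `algorithms/topsis.py`: the function name `topsis`, the three sample cars, the equal weights, and the direction of each criterion (newer is better, cheaper is better, fewer kilometers is better) are all hypothetical.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per row; higher is better.

    matrix  : rows are alternatives (cars), columns are criteria
    weights : relative importance of each criterion
    benefit : True where a larger value is better, False where smaller is better
    """
    matrix = np.asarray(matrix, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)

    # Normalise the weights so they sum to 1.
    weights = weights / weights.sum()

    # 1. Vector-normalise each criterion column.
    norm = matrix / np.sqrt((matrix ** 2).sum(axis=0))

    # 2. Apply the weights.
    weighted = norm * weights

    # 3. Ideal best and ideal worst value per criterion.
    ideal_best = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    ideal_worst = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))

    # 4. Euclidean distance of each alternative to both ideals.
    d_best = np.sqrt(((weighted - ideal_best) ** 2).sum(axis=1))
    d_worst = np.sqrt(((weighted - ideal_worst) ** 2).sum(axis=1))

    # 5. Relative closeness to the ideal solution.
    #    (Undefined if every alternative is identical; not handled here.)
    return d_worst / (d_best + d_worst)

# Hypothetical cars: [year, price, kilometers driven]
cars = [
    [2018, 15000, 60000],
    [2012, 7000, 150000],
    [2019, 14000, 50000],
]
scores = topsis(cars, weights=[1, 1, 1], benefit=[True, False, False])
ranking = np.argsort(-scores)  # row indices, best first

for idx in ranking:
    print(f"car {idx}: score={scores[idx]:.3f}")
```

Raising a criterion's weight pulls the ranking toward cars that do well on that criterion, which is the role the preference/weight settings play in the UI.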
The project aims to analyze a dataset of cars, group them into meaningful clusters, and then provide recommendations for the best cars within a specific cluster based on multiple criteria. This system could be useful for car dealerships or consumers looking for specific types of vehicles that best match their preferences.
analyze_that_car/
├── ui/
│ ├── ui_kmeans_cluster_analyze.py
│ └── ui_topsis_search.py
│
├── utils/
│ ├── __init__.py
│ └── data_processing.py
│
├── algorithms/
│ ├── __init__.py
│ ├── kmeans_clustering.py
│ └── topsis.py
│
├── data/
│ ├── raw_data/
│ └── saved_data/
│
├── main.py
└── requirements.txt
- Clone the repository: `git clone https://github.com/GeorgiLukanov87/analyze_that_car.git`, then `cd analyze_that_car/`
- Create a virtual environment (recommended): `python -m venv venv`
- Activate the virtual environment:
  - On Windows: `venv\Scripts\activate`
  - On macOS and Linux: `source venv/bin/activate`
- Install the required packages: `pip install -r requirements.txt`
- Ensure you have the necessary data files:
  - Place your car dataset CSV file in the `data/raw_data` directory for easy access.
- Python 3.7+
- Streamlit
- Pandas
- NumPy
- Matplotlib
- Scikit-learn
- AgGrid (for Streamlit)
For a complete list of dependencies, refer to the `requirements.txt` file.
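Start the app from the repository root (the `analyze_that_car/` directory, where `main.py` lives): `streamlit run main.py`

Streamlit then reports where the app is being served:

You can now view your Streamlit app locally in your browser.

- Local URL: http://localhost:8501
- Network URL: http://192.168.100.12:8501 (this address depends on your own network)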