This project analyzes and visualizes sales data. It ingests data from CSV files or from websites via web scraping, processes it with pandas, and presents it in an interactive Streamlit interface. An AI-powered chat feature answers natural language questions about the loaded data, and dynamic charts help reveal trends and patterns. AI is also used to structure scraped web data into organized tables, so data from either source can be analyzed in the same way.
- Flexible Data Ingestion: Import sales data from CSV files. Users can change the working directory to reach files stored elsewhere on disk. Web scraping is also supported, so data can be pulled from websites as well as local files.
- Data Processing with Pandas and Streamlit: Data is cleaned, transformed, and aggregated with the pandas library, and the results are explored through an interactive Streamlit interface.
- Interactive Data Exploration with AI-Powered Chat: Users can ask natural language questions about their CSV data and receive answers from the integrated AI chat, without writing queries by hand.
- Dynamic Data Visualization: Sales data can be plotted with a variety of chart types to help identify trends, patterns, and outliers.
- AI-Powered Table Generation from Web Data: Scraped web data is automatically structured into organized tables using AI, turning unstructured page content into a format ready for analysis and reporting.
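As a rough illustration of the CSV ingestion and aggregation features listed above, the sketch below loads sales data with pandas and totals revenue per month. The column names (`date`, `product`, `revenue`), the helper functions, and the inline sample rows are assumptions made for this example; the project's actual CSV schema and code may differ.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for an uploaded CSV file.
SAMPLE_CSV = """date,product,revenue
2024-01-05,Widget,120.0
2024-01-20,Gadget,80.0
2024-02-03,Widget,200.0
2024-02-11,Gadget,
"""


def load_sales(source) -> pd.DataFrame:
    """Read a sales CSV, parse dates, and drop rows without revenue."""
    df = pd.read_csv(source, parse_dates=["date"])
    return df.dropna(subset=["revenue"])


def monthly_revenue(df: pd.DataFrame) -> pd.Series:
    """Sum revenue per calendar month (index formatted as YYYY-MM)."""
    months = df["date"].dt.strftime("%Y-%m")
    return df.groupby(months)["revenue"].sum()


sales = load_sales(io.StringIO(SAMPLE_CSV))
totals = monthly_revenue(sales)
print(totals)
```

`load_sales` accepts a file path as well as a file-like object, so the same function would work on a CSV on disk. In a Streamlit app, a series like `totals` could be handed to a chart call such as `st.line_chart(totals)`.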
- Python 3.x
- pandas
- Streamlit
- Selenium
- Ollama with the Llama 3.2 model by Meta
- Clone the repository:
git clone https://github.com/vy-phan/WebScraping.git
- Navigate to the project directory:
cd WebScraping
- Install the required packages:
pip install -r requirements.txt
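After installing, a quick check that the dependencies can be found may save a failed first run. The sketch below uses only the standard library. The module names are assumed from the requirements listed above, and `requirements.txt` remains the authoritative list.

```python
from importlib.util import find_spec

# Assumed import names for the listed dependencies; the authoritative list
# is whatever requirements.txt specifies.
REQUIRED_MODULES = ["pandas", "streamlit", "selenium"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```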
This guide outlines the steps to install Ollama and the Llama 3.2 model.
- Download the Ollama installer from the official website: https://ollama.ai/download
- Run the installer and follow the on-screen instructions to complete the installation.
- Open a terminal or command prompt.
- Run the following command to install the Llama 3.2 model:
ollama pull llama3.2
- Run the following command to verify that the model has been installed correctly:
ollama list
You should see llama3.2 listed among the installed models.
After successful installation and verification, you can start a chat session with the Llama 3.2 model using:
ollama run llama3.2
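Besides the interactive session, a running Ollama server also exposes a local HTTP API, which is one way an application could send questions about CSV data to the model. The sketch below is an assumption about how such a call might look, not the project's actual chat code. The prompt wording, the `build_payload` and `ask` helpers, and the idea of sending a short CSV preview are all hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address
MODEL = "llama3.2"  # model tag as published in the Ollama library


def build_payload(question: str, csv_preview: str, model: str = MODEL) -> dict:
    """Assemble a non-streaming /api/generate request body."""
    prompt = (
        "You are a sales data assistant. Here are the first rows of a CSV:\n"
        f"{csv_preview}\n\nQuestion: {question}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def ask(question: str, csv_preview: str, base_url: str = OLLAMA_URL) -> str:
    """Send the question to a running Ollama server and return its answer."""
    body = json.dumps(build_payload(question, csv_preview)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

With the server running and the model pulled, a call such as `ask("Which month had the highest revenue?", preview_text)` would return the model's reply as a string. Only a preview of the data is sent here to keep the prompt small, which is a design choice for this sketch rather than a requirement.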