This project uses Natural Language Processing (NLP), specifically VADER sentiment analysis, to predict the daily gain or loss of Indian market indices. The model is trained on a dataset of news headlines and the corresponding stock index details.
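As a rough illustration of the scoring step, here is a minimal sketch that assumes NLTK's `SentimentIntensityAnalyzer` and a headline column named `Title`. The helper names `make_analyzer` and `score_headlines` are hypothetical and are not taken from the notebook, which may structure this differently.

```python
import pandas as pd


def make_analyzer():
    """Build NLTK's VADER analyzer, fetching the lexicon if it is missing."""
    # Imported lazily so the scoring helper below only depends on pandas.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    return SentimentIntensityAnalyzer()


def score_headlines(news: pd.DataFrame, analyzer, text_col: str = "Title") -> pd.DataFrame:
    """Return a copy of `news` with a `compound` sentiment column added.

    `analyzer` is any object exposing `polarity_scores(text)` that returns a
    dict with a "compound" key, as VADER's analyzer does.
    """
    scored = news.copy()
    scored["compound"] = scored[text_col].astype(str).map(
        lambda text: analyzer.polarity_scores(text)["compound"]
    )
    return scored
```

A call such as `score_headlines(news_df, make_analyzer())` would add one score per headline. VADER's compound score is a normalized value between -1 (most negative) and +1 (most positive).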
The dataset comprises two main files:
- News: Contains columns for Date, Title (Headline), and Description.
- Index details: Contains columns for Date, the index values (Opening, High, Low, and Closing prices), and a label indicating whether the index gained or lost that day.
The dataset used here pertains to the NIFTY50 index of the Indian Stock Market.
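One way the two files could be combined is sketched below. It assumes each headline already carries a sentiment score in a hypothetical `compound` column and that both files share a `Date` column. The function names are illustrative, and the notebook may aggregate or join the data differently.

```python
import pandas as pd


def daily_sentiment(scored_news: pd.DataFrame, date_col: str = "Date",
                    score_col: str = "compound") -> pd.DataFrame:
    """Average per-headline scores into one value per calendar day."""
    out = scored_news.copy()
    # Normalize to midnight so headlines from the same day group together.
    out[date_col] = pd.to_datetime(out[date_col]).dt.normalize()
    return out.groupby(date_col, as_index=False)[score_col].mean()


def join_with_index(daily: pd.DataFrame, index_df: pd.DataFrame,
                    date_col: str = "Date") -> pd.DataFrame:
    """Attach the daily sentiment to the index rows that share a date."""
    idx = index_df.copy()
    idx[date_col] = pd.to_datetime(idx[date_col]).dt.normalize()
    # Inner join: keep only days present in both the news and the index data.
    return idx.merge(daily, on=date_col, how="inner")
```

The inner join drops days that have headlines but no trading session (for example weekends), which is a design choice here rather than something the dataset requires.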
To test the project:
- Download the `.ipynb` file and open it in Colab.
- Upload the dataset files provided in this repository (an alternate dataset for the Dow Jones index is also available).
- Run all the cells sequentially.
Ensure you have the following Python libraries installed:
- pandas
- numpy
- nltk
Contributions to this project are welcome! If you have suggestions for improvements or find any issues, please feel free to open an issue or submit a pull request.