An experiment exploring the fascinating potential of large language models to efficiently classify short news headlines and summaries into 'positive', 'neutral', and 'negative' sentiments. Here we use Alpaca, BERT, and ChatGPT to:
- import news headlines and summaries for specific stock symbols
- use the NLP model Bidirectional Encoder Representations from Transformers (BERT) to tokenize the news data and classify it into sentiments (see the sketch after this list)
- classify the same news data with OpenAI's gpt-3.5
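As a taste of the BERT step, here is a minimal sketch that classifies a single headline. The checkpoint `ProsusAI/finbert` is an illustrative assumption; the notebook may load a different finance-tuned model.

```python
# Minimal sketch: classify one headline with a finance-tuned BERT.
# "ProsusAI/finbert" is an illustrative choice of checkpoint, not
# necessarily the one used in the notebook.
from transformers import pipeline

classifier = pipeline("text-classification", model="ProsusAI/finbert")
print(classifier("Company beats quarterly earnings estimates"))
# e.g. [{'label': 'positive', 'score': 0.95}]
```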
A Jupyter notebook (llm_sentiment_classifier.ipynb) which takes in financial news for tickers (AKA stock symbols) and returns sentiments is provided. Final results should look something like the sample output shown in the notebook.
Step 0 - Install the necessary libraries, if they are not already installed, then import them. LangChain sometimes has dependency issues, so it is recommended to install it with the `--upgrade` flag:
```bash
pip -q install alpaca-trade-api alpaca-py transformers openai tiktoken
pip -q install langchain --upgrade
```
Most APIs provide a security option whereby you store your authentication details in environment variables. This enables you:
- to authenticate your logins without committing credentials to code repositories such as GitHub
- in addition to the above, to control your privileges, such as OpenAI's tokens
- to protect yourself from inadvertently exposing your secret IDs and keys in any environment, including live trading platforms such as Alpaca
- OpenAI's advice: https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety
- Alpaca: https://medium.com/software-engineering-learnings/algorithmic-trading-with-alpaca-and-python-c81bad480053
- General: https://www.twilio.com/blog/how-to-set-environment-variables-html
```python
import os
import openai

# Read the keys from environment variables set beforehand
openai.api_key = os.environ["OPENAI_API_KEY"]
API_KEY = os.getenv("APCA_API_KEY_ID")
API_SECRET = os.getenv("APCA_API_KEY_SECRET")
print('keys imported')
```
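With the keys loaded, news can be pulled per ticker. Below is a minimal sketch using `alpaca-trade-api`'s news endpoint; the notebook's exact call may differ:

```python
import os
from alpaca_trade_api import REST

# Minimal sketch: fetch recent news items for one ticker.
# Assumes APCA_API_KEY_ID and APCA_API_KEY_SECRET are set in the environment.
api = REST(os.getenv("APCA_API_KEY_ID"), os.getenv("APCA_API_KEY_SECRET"))
for item in api.get_news("AAPL", limit=5):  # each item carries a headline and summary
    print(item.headline, "|", item.summary)
```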
More info here: https://docs.pydantic.dev/latest/usage/models/ which states:
- These models are similar to Python's dataclasses with some differences that streamline certain workflows related to validation, serialization, and JSON schema generation. Untrusted data can be passed to a model and, after parsing and validation, Pydantic guarantees that the fields of the resultant model instance will conform to the field types defined on the model.
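For illustration, here is a minimal Pydantic sketch; the model and field names are hypothetical, not the notebook's actual classes:

```python
from pydantic import BaseModel, ValidationError

# Hypothetical model of a news item, purely for illustration.
class NewsItem(BaseModel):
    headline: str
    summary: str

# Parsing and validation happen at construction time.
item = NewsItem(headline="Fed holds rates", summary="No change expected.")

try:
    NewsItem(headline="Fed holds rates", summary=None)  # wrong type
except ValidationError as err:
    print(err)  # Pydantic rejects data that does not match the field types
```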
- we will use a fine-tuned (trained on financial news) Hugging Face model (BERT) to analyze the sentiment of each article's headline and summary
- initial model downloads might take some time
- more recent news might be more relevant
- the sentiment confidence also gives us a clue about how certain the algorithm is of its classification
- weigh recent news more heavily (a straightforward linear increase going from old to new, although there can be many variations of this approach, such as inverted/hyperbolic, linear-with-noise, etc.)
- use the sentiment confidence to adjust our weights, i.e. multiply the recency score by the sentiment score (a sketch of this appears below)
- the function `sentiment_to_weighed` takes care of the weighting
- the function `sentiment_analysis` takes in a list of tickers and returns a weighted sentiment per ticker
- since OpenAI's token allotments deplete quickly, a few lines in `sentiment_analysis` are commented out and gated behind a flag. Uncomment them if you have enough tokens left (after changing the `do_llm` flag to "1"). Note that LLM sentiment classification is slow (a minimal gpt-3.5 sketch also follows below).
Please do not directly copy anything without my consent. Feel free to reach out to me at https://www.linkedin.com/in/mulugeta-semework-abebe/ for ways to collaborate or use some components.
langchain is under the MIT license and the Alpaca trade API is under the Apache License 2.0. Please view LICENSE and https://www.apache.org/licenses/LICENSE-2.0 for more details. For other packages, click on the corresponding links at the top of this page (first line).