Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predictions #15

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
92 changes: 92 additions & 0 deletions predictions_FB.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from fbprophet import Prophet
import yfinance as yf
import math
import os
import sys
Comment on lines +7 to +8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be interacting with the system or calling any lower level functions since our packages only serve high level abstract functions. I would think about another way to suppress the logs. Perhaps try overriding the functions from fbprophet?

import pandas as pd
ApurvShah007 marked this conversation as resolved.
Show resolved Hide resolved
pd.options.mode.chained_assignment = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be added at the root? Perhaps we could move this inside a function using this?



class suppress_stdout_stderr(object):
'''
Used to supress the verbose output of the inner workings of the FB prophet model
'''
def __init__(self):
self.null_fds = [os.open(os.devnull, os.O_RDWR) for x in range(2)]
self.save_fds = (os.dup(1), os.dup(2))

def __enter__(self):
os.dup2(self.null_fds[0], 1)
os.dup2(self.null_fds[1], 2)

def __exit__(self, *_):
os.dup2(self.save_fds[0], 1)
os.dup2(self.save_fds[1], 2)
os.close(self.null_fds[0])
os.close(self.null_fds[1])
Comment on lines +16 to +28
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we shouldn't be interacting with any os methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way to supress the output seems to be this. This is on the official fbprophet repo issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this? This is from the same issue thread. Can you try this out?


def fb_predict(dfi, plot = False, days =1):
ApurvShah007 marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add type hints to each parameter to help the calls understand what kind of arguments are to be sent?

These are typically defined using the : operator and you can even assign a default value using = in combination with this.

So for this, I imagine it would be somewhat like:

def fb_predict(dfi: file, plot: bool=False, days: int=1, verbosity: bool=True):

'''
This function takes in:
dfi - A dataframe with just the column of close price named as Close and date as index
plot - Whether you want to plot the resulyts or not( Default is False)
days - Number of days into the future you want to predict (Dafault is 1)

This function returns:
mean_err - The mean percentage error in calculations
pred - A pandas datafrmae with predictions, upper and lower bounds

'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job documenting the method! If you use PyCharm and initialize a multiline comment (doc string), it'll create the fields for documentation for you!

The benefit of doing that is that you can access the documentation of a function using the __doc__ property. So for fb_predict, you can access the documentation for this method by fb_predict.__doc__

Typically, you'd want to add @:param for each parameter and @:return to describe what is being returned.

This article should help you get started.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, have a look and let me know if it looks good now.

df = dfi.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since dfi isn't used anywhere else (correct me if I'm wrong), I think we can use it directly than copying its value in df

arr = [i for i in range(len(df))]
ar = df.index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we directly use df.index (unless its value's changing)? That would save us an unnecessary state.

df.index= arr
df['ds'] = ar
df = df[['ds', 'Close']]
df.rename(columns = {"Close":'y'}, inplace = True)

train_length = math.floor(0.1*len(df))
train = df[:-train_length]
test = df[-train_length:]
m = Prophet(daily_seasonality = True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a more descriptive name to this?

with suppress_stdout_stderr():
m.fit(train)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we can override the fit function to manipulate the logs?

future = m.make_future_dataframe(periods=train_length)
prediction = m.predict(future)
test_predict = prediction[-train_length:]
test['predict'] = test_predict['trend']
test['error'] =((test['y'] - test['predict'])/test['y'])*100
mean_err = round(test['error'].mean(),3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep the test models in production?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been implemented to calculate the error by splitting the data into test and train as the main model trains on the entire historic data. If we have some other workaround then I would love to remove the extra model.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't need this in the final thing, I'd say we remove this. If you want it in development, I'd recommend having the test model in another file which runs within the fb_predict method. Then you could add that file to the .gitignore so that it isn't committed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we could even leave it be (in another file or method) and have another flag (in the method signature) to take a boolean param for calculating the error so that the test model isn't run every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok yes I can do that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make it False by default so if nothing is mentioned, it doesn't run

mn = Prophet(daily_seasonality = True)
with suppress_stdout_stderr():
mn.fit(df)
future = mn.make_future_dataframe(periods=days)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add more options for different types of models as we discussed.

Copy link
Contributor Author

@ApurvShah007 ApurvShah007 Aug 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that we can make like different functions corresponding to different models so the user can simply call different functions to use the different models

prediction = mn.predict(future)
pred = pd.DataFrame(prediction[-days:][['ds','trend', 'yhat_upper', 'yhat_lower']])
pred.rename(columns = {'ds' : 'Date', 'trend':'Guess', 'yhat_lower':'Lower Bound', 'yhat_upper':'Upper Bound'}, inplace = True)
pred.set_index(np.arange(1,len(pred)+1), inplace = True)
if plot:
m.plot(prediction)
plt.title("Prediction of the AZPN Stock Price using the Prophet")
plt.xlabel("Date")
plt.ylabel("Close Stock Price")
plt.show()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good candidate to factor into a helper function so that other functions can also use down the line.

return mean_err, pred

#Sample Run
# symbol = "AZPN"
# days = 5
# stock = yf.Ticker(symbol)
# df = stock.history(period="max")
# df = df[['Close']]
# me , pred= fb_predict(df, days = days)
# print("\n The prediction for {} after {} days is :\n".format(symbol, days))
# print(pred)
# print("\n Mean Percentage Error : ", me)
ApurvShah007 marked this conversation as resolved.
Show resolved Hide resolved