a possible fix to historical quotes #48

Open
wants to merge 4 commits into
base: master
49 changes: 26 additions & 23 deletions ystockquote.py
@@ -24,6 +24,9 @@
     # py2
     from urllib2 import Request, urlopen
     from urllib import urlencode
+from datetime import datetime, timedelta
+import json
+import re


 def _request(symbol, stat):
@@ -470,32 +473,32 @@ def get_historical_prices(symbol, start_date, end_date):
     Returns a nested dictionary (dict of dicts).
     outer dict keys are dates ('YYYY-MM-DD')
     """
+    epoch = datetime(1970, 1, 1)
+    p1 = datetime.strptime(start_date, '%Y-%m-%d')
+    p2 = datetime.strptime(end_date, '%Y-%m-%d')
     params = urlencode({
-        's': symbol,
-        'a': int(start_date[5:7]) - 1,
-        'b': int(start_date[8:10]),
-        'c': int(start_date[0:4]),
-        'd': int(end_date[5:7]) - 1,
-        'e': int(end_date[8:10]),
-        'f': int(end_date[0:4]),
-        'g': 'd',
-        'ignore': '.csv',
+        'period1': int((p1 - epoch).total_seconds()),
+        'period2': int((p2 - epoch + timedelta(days=1)).total_seconds()),
+        'frequency': '1d',
+        'filter': 'history',
     })
-    url = 'http://real-chart.finance.yahoo.com/table.csv?%s' % params
+    url = 'https://finance.yahoo.com/quote/%s/history?%s' % (symbol, params)
     req = Request(url)
     resp = urlopen(req)
-    content = str(resp.read().decode('utf-8').strip())
-    daily_data = content.splitlines()
+    content = resp.read()
+    quotes = re.findall('{"date":\d+[^}]+}', content)

Here you would need to call .decode() on content before passing it to re.findall.
It seems that you would also have to check whether it's Python 3 and call .decode() only in that case. There may be another, more elegant way of doing this; I'm not used to porting Python code.

Owner

Yes, adding .decode('utf-8').strip() seems to get past this.
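
A minimal sketch of the decode step discussed in this thread. The function name `find_quote_blobs` and the constant `QUOTE_RE` are hypothetical and not part of the PR. Since `resp.read()` returns a byte string on both Python 2 and Python 3, decoding it unconditionally should avoid the need for a version check; the pattern is the one from the diff, rewritten as a raw string with the braces escaped.

```python
import re

# Same pattern as in the diff, as a raw string with literal braces escaped.
QUOTE_RE = re.compile(r'\{"date":\d+[^}]+\}')


def find_quote_blobs(raw):
    """Return the JSON-like quote fragments found in a response body.

    `raw` may be a byte string (what resp.read() returns) or already text.
    """
    if isinstance(raw, bytes):
        # On Python 2, bytes is an alias for str, so this branch also runs there.
        raw = raw.decode('utf-8')
    return QUOTE_RE.findall(raw)
```

Passing either the raw bytes or an already decoded string should give the same list of fragments.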

     hist_dict = dict()
-    keys = daily_data[0].split(',')
-    for day in daily_data[1:]:
-        day_data = day.split(',')
-        date = day_data[0]
-        hist_dict[date] = \
-            {keys[1]: day_data[1],
-             keys[2]: day_data[2],
-             keys[3]: day_data[3],
-             keys[4]: day_data[4],
-             keys[5]: day_data[5],
-             keys[6]: day_data[6]}
+    for quote in quotes:
+        j = json.loads(quote)
+        for k in ('open', 'close', 'high', 'low', 'unadjclose'):
+            j[k] = "%.2f" % j[k]

ystockquote.get_historical_prices('GOOGL', '2006-01-01', '2017-05-29') would throw KeyError: 'open' at this line.

Author

@starforever, I do not get that error, either in my own tests or when running the test suite.
(Note that I have not tested in Python 3.x.)
Does it happen in Python 3.x?

I'm using Python 2.7. I'm not sure whether you are using the exact same arguments; it only happens for some specific symbols and time ranges. I haven't had time to dig into this, so I have no idea how it happens.

@starforever May 31, 2017

I'm guessing it doesn't throw the error in the tests because those specific arguments are not used there. I silently caught the exception and it worked fine after that, but I think this is only a quick fix.

Author

OK, I see the issue: when the date range includes a split or dividend, the pattern match pulls in that entry as well, but it has no 'open', 'close', etc.
I think I may add a request that includes a split to the tests, and hard-code some values to also check the adjusted/unadjusted case you noted in another comment.
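
One way to handle the entries described above, as a rough sketch rather than the fix that was eventually merged: skip any matched fragment that lacks the price keys instead of letting the formatting loop raise KeyError. The helper name `format_price_rows` and the constant `PRICE_KEYS` are hypothetical, and the shape of a split or dividend entry is an assumption based only on the description in this thread.

```python
import json

# The keys the diff formats to two decimal places.
PRICE_KEYS = ('open', 'close', 'high', 'low', 'unadjclose')


def format_price_rows(quotes):
    """Parse quote fragments, keeping only rows that carry all price keys."""
    rows = []
    for quote in quotes:
        row = json.loads(quote)
        # Assumption: split/dividend entries match the regex but have no
        # price fields, so they are skipped here instead of raising KeyError.
        if not all(key in row for key in PRICE_KEYS):
            continue
        for key in PRICE_KEYS:
            row[key] = "%.2f" % row[key]
        rows.append(row)
    return rows
```

An explicit key check keeps the skip visible in the code, unlike a silent `except KeyError`, which could also hide unrelated problems in the response.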

Author

Yahoo's current data no longer includes unadjusted prices; all prices are adjusted for splits.
I had to stop using Yahoo a while ago because of this.

+        d = timedelta(seconds=j["date"])
+        # these keys are for backwards compatibility
+        for key in j.keys():
+            j[key.capitalize()] = j[key]
Owner

I'm getting an error when running the tests on this line. I'm not sure what you are trying to do here.

Please take a look at #52
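
The error is not quoted in this thread, so the following is only a guess at one possible cause. On Python 3, `j.keys()` is a live view, and adding keys to the dict while looping over it raises `RuntimeError: dictionary changed size during iteration`. A sketch of the same backwards-compatibility aliasing that iterates over a snapshot instead (the helper name `add_capitalized_keys` is hypothetical):

```python
def add_capitalized_keys(row):
    """Add a capitalized alias for every key, e.g. 'open' -> 'Open'."""
    # list(...) takes a snapshot of the keys, so inserting the aliases
    # does not change the collection being iterated over.
    for key in list(row.keys()):
        row[key.capitalize()] = row[key]
    return row
```

On Python 2, `keys()` already returns a list, which may be why the original loop did not fail there.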

+        # in the old api "Close" was the unadj close
+        j['Close'] = j['unadjclose']
+        j['Adj Close'] = j['close']
+        date = epoch + d
+        hist_dict[date.date().isoformat()] = j
     return hist_dict
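
The period1/period2 arithmetic in the diff can be exercised in isolation. The sketch below mirrors it with two hypothetical helpers, `to_period` and `to_iso_date`, and assumes dates are treated as naive UTC midnights, as the `epoch = datetime(1970, 1, 1)` line in the diff suggests.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def to_period(date_str, include_day=False):
    """Convert 'YYYY-MM-DD' to whole seconds since the epoch.

    With include_day=True one day is added, as the diff does for period2,
    so that the end date itself falls inside the requested range.
    """
    day = datetime.strptime(date_str, '%Y-%m-%d')
    if include_day:
        day += timedelta(days=1)
    return int((day - EPOCH).total_seconds())


def to_iso_date(seconds):
    """Convert seconds since the epoch back to a 'YYYY-MM-DD' string."""
    return (EPOCH + timedelta(seconds=seconds)).date().isoformat()
```

Converting a date to seconds and back should return the original string, which makes for a quick sanity check of the keys used in `hist_dict`.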