Data and Empirics: Pandas Exercise Error #4

Open
duncangh opened this issue Aug 2, 2017 · 2 comments

duncangh commented Aug 2, 2017

In the first exercise, where the goal is to calculate the percentage price change over the year 2013 for a list of tickers, the graph showing the solution appears to be incorrect. The code itself appears to work, although it is not the most elegant solution. The "solution" graph shows an AAPL price change of roughly -50%, when in fact Apple appreciated in 2013.

Here is a cleaner implementation of the solution that leverages more of pandas' great functionality.

import pandas as pd

ticker_list = {'INTC': 'Intel',
               'MSFT': 'Microsoft',
               'IBM': 'IBM',
               'BHP': 'BHP',
               'TM': 'Toyota',
               'AAPL': 'Apple',
               'AMZN': 'Amazon',
               'BA': 'Boeing',
               'QCOM': 'Qualcomm',
               'KO': 'Coca-Cola',
               'GOOG': 'Google',
               'SNE': 'Sony',
               'PTR': 'PetroChina'}

syms = list(ticker_list.keys()) # use to filter columns

path = 'https://github.com/QuantEcon/QuantEcon.lectures.code/raw/master/pandas/data/ticker_data.csv'

# set the index col to the date column and parse it as datetime type. 
# Slice out only relevant columns
df = pd.read_csv(path, index_col=0, parse_dates=[0]).loc[:, syms]

# resample by year (2013) and take the first and last values,
# then use pandas' built-in pct_change() method across the first/last share price rows
# note: newer pandas releases deprecate the 'A' alias in favor of 'YE'
answer = 100 * pd.concat([df.resample('A').first(), df.resample('A').last()]).pct_change().dropna()

# replace ticker symbols with company names for aesthetics
answer.rename(columns=ticker_list, inplace=True)

# Create bar plot of percent change
answer.T.plot.bar()
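
For anyone who wants to check the first/last `pct_change()` idea without downloading the CSV, here is a minimal sketch on made-up data. The tickers `UP` and `DOWN` and their prices are hypothetical, not taken from the lecture data. It also groups by calendar year rather than using a resample alias, which is a swapped-in technique chosen so the snippet does not depend on which alias spelling the installed pandas accepts.

```python
import numpy as np
import pandas as pd

# Hypothetical two-ticker price table for 2013 (synthetic numbers, not the lecture data)
dates = pd.date_range("2013-01-01", "2013-12-31", freq="D")
prices = pd.DataFrame(
    {"UP": np.linspace(100.0, 150.0, len(dates)),   # rises from 100 to 150
     "DOWN": np.linspace(80.0, 60.0, len(dates))},  # falls from 80 to 60
    index=dates,
)

# Group rows by calendar year and pick the first and last price in each year
by_year = prices.groupby(prices.index.year)
first, last = by_year.first(), by_year.last()

# Stack first over last, then take the row-to-row percentage change;
# dropna() removes the leading row, which has nothing to compare against
check = 100 * pd.concat([first, last]).pct_change().dropna()
print(check)
```

The result is a one-row DataFrame with a positive entry for the rising ticker and a negative entry for the falling one, which is the sign check that the original solution graph failed for AAPL.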

jstac commented Aug 2, 2017

Thanks @duncangh, much appreciated! We'll review this and I'm sure it will benefit from your feedback.

natashawatkins (Member) commented

Hi @duncangh, thanks for the feedback! You're correct: the plot was wrong.

I agree .pct_change() is a cool method, although in this example it's a little tricky to use, as we want to calculate the change between only the first and last day of 2013. Your method works well; however, the result is a one-row DataFrame, which makes it awkward to sort, and we'd also need to rename the column, etc.
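
To illustrate the sorting point, here is a rough sketch of a Series-based alternative. It is not necessarily the code that ended up in the lecture, and the tickers, names, and prices are hypothetical. Dividing the last row by the first gives a Series indexed by ticker, which can be sorted directly.

```python
import pandas as pd

# Hypothetical tickers, names and prices (synthetic, for illustration only)
names = {"AAA": "Alpha", "BBB": "Beta", "CCC": "Gamma"}
prices = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0],
     "BBB": [50.0, 45.0, 40.0],
     "CCC": [20.0, 22.0, 21.0]},
    index=pd.to_datetime(["2013-01-02", "2013-07-01", "2013-12-31"]),
)

# Select the 2013 rows via partial-string indexing on the DatetimeIndex
year = prices.loc["2013"]

# Last row divided by first row gives a Series indexed by ticker symbol
change = 100 * (year.iloc[-1] / year.iloc[0] - 1)

# Swap symbols for company names and sort from worst to best performer
change = change.rename(index=names).sort_values()
print(change)
```

Because `change` is a Series, `change.plot.bar()` would plot it directly with no transpose needed.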

I have gone ahead and updated the code a little to get rid of the dictionary to Series conversion though.
