Data pre 2014 for OES: thoughts and musings #6

Open
mbkupfer opened this issue Nov 1, 2018 · 1 comment

@mbkupfer
Owner

mbkupfer commented Nov 1, 2018

The main inconsistency in the pre-2014 data is in the MSA files, which are split into three parts (because of limitations in older Excel versions). For example, here are the OES MSA files for 2013:

MSA_M2013_dl_1_AK_IN.xls
MSA_M2013_dl_2_KS_NY.xls
MSA_M2013_dl_3_OH_WY.xls

These can be combined into a single DataFrame with pd.concat(). For example:

import pandas as pd

# zip_ma is assumed to be an open zipfile.ZipFile for the 2013 MSA download
df1 = pd.read_excel(zip_ma.open('MSA_M2013_dl_1_AK_IN.xls'))
df2 = pd.read_excel(zip_ma.open('MSA_M2013_dl_2_KS_NY.xls'))
df3 = pd.read_excel(zip_ma.open('MSA_M2013_dl_3_OH_WY.xls'))
df_all = pd.concat([df1, df2, df3], sort=False)

All other files appear to have the same layout as in later years.
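
Rather than hard-coding the three names, the split parts could probably be picked out of the archive listing by pattern. This is only a sketch: the regular expression is inferred from the three 2013 file names above and has not been checked against other years, and `split_msa_parts` is a made-up helper name.

```python
import re

# Pattern inferred from the 2013 file names quoted above (assumption:
# other pre-2014 years follow the same naming scheme).
MSA_PART = re.compile(r"MSA_M(\d{4})_dl_(\d+)_[A-Z]{2}_[A-Z]{2}\.xls$")


def split_msa_parts(names):
    """Return the split MSA file names from a listing, ordered by part number."""
    found = []
    for name in names:
        match = MSA_PART.search(name)
        if match:
            # group(2) is the part number (1, 2 or 3 in the 2013 files)
            found.append((int(match.group(2)), name))
    return [name for _, name in sorted(found)]
```

With an open archive this would be called as split_msa_parts(zip_ma.namelist()), and the resulting names fed to pd.read_excel and pd.concat as in the example above.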

@mbkupfer
Owner Author

mbkupfer commented Nov 1, 2018

# yy is the two-digit year as a string, e.g. '13'
df_metros_1 = pd.read_excel(zip_ma.open(f'MSA_M20{yy}_dl_1_AK_IN.xls'))
df_metros_2 = pd.read_excel(zip_ma.open(f'MSA_M20{yy}_dl_2_KS_NY.xls'))
df_metros_3 = pd.read_excel(zip_ma.open(f'MSA_M20{yy}_dl_3_OH_WY.xls'))
df_metros = pd.concat([df_metros_1, df_metros_2, df_metros_3], sort=False)
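
The same thing could be wrapped in a small helper so the three names are built in one place. This is a rough sketch, not tested against the real downloads: `MSA_PARTS`, `msa_file_names` and `read_msa` are hypothetical names, the part suffixes are copied from the file names in this issue, and it assumes every pre-2014 year uses the same suffixes.

```python
# Part suffixes copied from the file names quoted in this issue
# (assumption: identical for every pre-2014 year).
MSA_PARTS = ("1_AK_IN", "2_KS_NY", "3_OH_WY")


def msa_file_names(yy):
    """Build the three split MSA file names for a two-digit year string."""
    return [f"MSA_M20{yy}_dl_{part}.xls" for part in MSA_PARTS]


def read_msa(zip_ma, yy):
    """Read and stack the three parts; zip_ma is an open zipfile.ZipFile."""
    # Imported here so msa_file_names can be used without pandas installed.
    import pandas as pd

    frames = [pd.read_excel(zip_ma.open(name)) for name in msa_file_names(yy)]
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating
    # each part's own index; drop it to match the snippet above exactly.
    return pd.concat(frames, sort=False, ignore_index=True)
```

Usage would be something like df_metros = read_msa(zip_ma, '13').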
