Should it be possible to interpolate from other features in the imputing.impute_all_assets_by_correlation function? #301

Open
achenry opened this issue Oct 21, 2024 · 1 comment

achenry commented Oct 21, 2024

It looks like, regardless of which reference_col argument is passed to imputing.impute_all_assets_by_correlation, the reference_col passed on to the internally called impute_data function is always set to impute_col:

reference_col=impute_col,

@ejsimley
Collaborator

Hi @achenry, you're right that reference_col should be passed to the reference_col argument of impute_data instead of impute_col. So that is a quick fix. I also noticed that when determining the most highly-correlated asset to use for imputing, the correlation coefficients between assets are determined using impute_col only:
corr_df = asset_correlation_matrix(data, impute_col)

I think for reference_col to get used as intended, the correlation coefficients should be based on the correlation between reference_col on one asset and impute_col on the other. Does that make sense? I'll try to modify asset_correlation_matrix to give the reference-to-impute column correlation.
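
To make the proposed change concrete, here is a minimal sketch of a reference-to-impute correlation matrix. It is not the library's implementation: the function name `cross_column_asset_correlation`, the long-format layout with `time` and `asset_id` columns, and the sample data are all assumptions made for illustration. Entry `[a, b]` holds the Pearson correlation between impute_col on asset `a` and reference_col on asset `b`.

```python
import numpy as np
import pandas as pd


def cross_column_asset_correlation(data, impute_col, reference_col,
                                   asset_col="asset_id", time_col="time"):
    """Hypothetical helper (not part of the library).

    Returns a DataFrame where entry [a, b] is the correlation between
    impute_col on asset a and reference_col on asset b.
    """
    # Reshape to one column per asset, indexed by timestamp.
    impute_wide = data.pivot(index=time_col, columns=asset_col, values=impute_col)
    reference_wide = data.pivot(index=time_col, columns=asset_col, values=reference_col)

    assets = impute_wide.columns
    corr = pd.DataFrame(np.nan, index=assets, columns=assets, dtype=float)
    for a in assets:
        for b in assets:
            if a == b:
                continue  # leave the diagonal as NaN: an asset does not impute itself
            # Series.corr aligns on the index and ignores rows where either value is NaN.
            corr.loc[a, b] = impute_wide[a].corr(reference_wide[b])
    return corr


# Made-up sample data: three assets, six hourly timestamps.
times = pd.date_range("2024-01-01", periods=6, freq="h")
base_wind = np.array([4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
windspeeds = {
    "T01": base_wind + 0.5,
    "T02": base_wind,
    "T03": np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0]),
}
data = pd.concat(
    [
        pd.DataFrame({"time": times, "asset_id": asset, "windspeed": ws, "power": 10.0 * ws})
        for asset, ws in windspeeds.items()
    ],
    ignore_index=True,
)

corr = cross_column_asset_correlation(data, impute_col="power", reference_col="windspeed")

# For each asset, the candidate whose reference_col best tracks that asset's impute_col.
best_reference = corr.idxmax(axis=1)
```

With a matrix like this, the asset chosen for imputing is the one whose reference_col tracks the target's impute_col most closely, which is what the reference_col argument appears intended to enable. Note that the matrix is generally not symmetric when impute_col and reference_col differ.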
