Issue with data prep #2

Open
AIAdventures opened this issue Apr 22, 2017 · 3 comments

@AIAdventures

Hi Jim!
Great project!
I am just having trouble with the prep data module.
I'm running it on Linux Mint.

andrewcz@andrewcz-PORTEGE-Z30t-B ~/Desktop/Numerai/numerai dataset/numerai_datasets (13)/numerai $ python prep_data.py
/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
Fold #1
Traceback (most recent call last):
  File "prep_data.py", line 85, in <module>
    main()
  File "prep_data.py", line 50, in main
    rf.fit(X_split_train, y_split_train)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/ensemble/forest.py", line 247, in fit
    X = check_array(X, accept_sparse="csc", dtype=DTYPE)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/utils/validation.py", line 382, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: 'test'

Many thanks for your help,
Andrew

@GillesVandewiele

The data format has changed since last year. There are some columns that need to be dropped.

I used this in tournament 72: feature_cols = ['feature'+str(i) for i in range(1, 22)]
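
For anyone hitting the same ValueError, here is a minimal sketch of how that column list might be used to keep only the numeric feature columns before fitting. It uses only the standard library, and it is not the repository's code: the `load_features` helper, the `id` and `data_type` column names, and the two-row sample are all assumptions made up for illustration.

```python
import csv
import io

# Column names from the comment above: feature1 through feature21.
feature_cols = ['feature' + str(i) for i in range(1, 22)]


def load_features(csv_file, columns=feature_cols):
    """Hypothetical helper: read only the named columns as floats.

    Every other column (including any string-valued one) is ignored.
    """
    reader = csv.DictReader(csv_file)
    return [[float(row[name]) for name in columns] for row in reader]


# Hypothetical two-row sample. The 'id' and 'data_type' columns and their
# values are assumptions, chosen to mimic a string column holding 'test'.
header = ['id', 'data_type'] + feature_cols + ['target']
rows = [
    ['row1', 'train'] + ['0.5'] * 21 + ['1'],
    ['row2', 'test'] + ['0.25'] * 21 + ['0'],
]
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
buffer.seek(0)

# Only the 21 feature columns reach the model input, so the 'test'
# string is never passed to float().
X = load_features(buffer)
print(len(X), len(X[0]))
```

If the data is loaded into a pandas DataFrame instead, the same list can presumably be used to select columns, e.g. `df[feature_cols]`, before calling `fit`.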

@jimfleming
Owner

Yes, this code is pretty out of date now. I may update it in the future as time allows.

@GillesVandewiele

Hey @jimfleming, I adapted parts of your code to work with the current format. I'll try sending a PR in the near future!
