Working with big data #136
Comments
@Utayaki, thank you very much for your support of HyperparameterHunter and for raising this issue!

Unfortunately, for now, HyperparameterHunter doesn’t handle automatically loading big datasets in batches. I would absolutely love to add support for this, but my todo list is getting quite long, and I’m completely focused on finalizing v3.0.0 at the moment. So I fear it may be a while unless you (or someone following this issue) is willing to work with me on a solution or submit a PR.

If your dataset isn’t publicly available, can you refer me to a large dataset I can use for testing? Forgive me if this is a silly question, as I’m quite unfamiliar with processing big data, but is it possible for you to use

Thanks again for raising this issue. Until we find a solution, I'd like to keep it open to see how many others have the same problem and want to help fix it.
@HunterMcGushion I will be glad to work with you on this issue to make your library better; it will be a very interesting experience for me. As for the big dataset, how would you like me to share it for testing?

Have a nice day and again, sorry for keeping you waiting for so long!
No worries! Aside from the documentation, which you've already seen, I'd recommend a few other resources:
For an example of how to implement your own
I think it's worth mentioning that the callbacks defined for the library in the Also note that you probably don't need to worry about the

Sorry about that wall of text. If you have any other questions, please don't hesitate to ask here. I'm more than happy to help however I can!

Edit: If you're still using hyperparameter_hunter v2.2.0 (or lower), you should consider switching to the most recent v3.0.0 alpha build. 3.0.0 completely overhauls how datasets are handled via the
@HunterMcGushion
That's the approach I came up with while reading your examples. It seems pretty easy to code. Am I right in my statements above?

Have a nice day!
Yeah, that sounds about right if you're using the

Again, I'm not sure how this will work out, but feel free to post code here, or to fork the repo and create a new branch, so we can discuss it more easily.
I was wrong. I’m sorry, but after looking into it a bit, I’ve just realized that this will require changes to
Your library is superb! I want to use it in my main project.
However, I'm working with big data, and I saved my NumPy arrays in batches to a folder because I can't hold the full array in RAM during training. As far as I can see, though, the library needs to load the DataFrame completely, which unfortunately I can't do.
Is there a solution to that problem?
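For context, the setup described above (batches saved as separate NumPy files in a folder) could be read back one batch at a time with something like the sketch below. It uses plain NumPy only and is not part of HyperparameterHunter's API. The `batch_*.npy` naming scheme and the helper names `iter_batches` and `running_column_means` are assumptions made up for illustration.

```python
import glob
import os

import numpy as np


def iter_batches(folder, pattern="batch_*.npy"):
    """Yield saved batches one at a time instead of loading them all at once.

    The file pattern is a hypothetical naming scheme. Files are visited in
    sorted filename order, so zero-padded indices (batch_000, batch_001, ...)
    keep the batches in sequence.
    """
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        # mmap_mode="r" memory-maps the file read-only, so data is paged in
        # on access rather than copied into RAM up front.
        yield np.load(path, mmap_mode="r")


def running_column_means(folder):
    """Compute per-column means over all batches with one batch open at a time."""
    total, count = None, 0
    for batch in iter_batches(folder):
        batch_sum = np.asarray(batch).sum(axis=0)
        total = batch_sum if total is None else total + batch_sum
        count += batch.shape[0]
    if count == 0:
        raise ValueError("no batch files found in " + folder)
    return total / count
```

The same generator could feed any step that accepts data incrementally. Whether it can be plugged into the library's own data handling is exactly the open question in this issue.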