[BUG] lightgbm and CrossValidator are not compatible #2323
Comments
@mhamilton723 @memoryz @dylanw-oss @svotaw @imatiach-msft Thanks again if you have time to assign this task and give me some suggestions. On behalf of the other users who have the same issue, I highly appreciate your time and effort!
By the way, the dataset has around 7 million rows and more than 700 features. The data types are numerical and VectorUDT(), the latter produced by FeatureHash. All null values and erroneous data have been excluded; I know the data is clean because I can run LightGBMClassifier on its own successfully.
I finally succeeded by using an enclosed structure, but it runs slowly and Databricks does not show a progress bar. Could anyone try to reproduce this bug? I believe cross-validation is important for avoiding overfitting. Thank you so much to anyone who can help me out.
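The enclosed structure is not shown in this thread. One plausible reading is a CrossValidator built and fitted inside the hyperopt objective, so that each trial is scored by its cross-validated metric. The sketch below is only a guess at that shape: the column names `features` and `label`, the search ranges, the AUC metric, and the helper names `to_lgbm_params`, `fits_per_search`, `make_objective`, and `run_search` are all assumptions for illustration, not code from the report. It uses plain `Trials` on the assumption that the Spark ML training inside each trial is already distributed.

```python
def to_lgbm_params(sample):
    """Cast a raw hyperopt sample to the types LightGBMClassifier expects."""
    return {
        "numLeaves": int(sample["numLeaves"]),  # hp.quniform yields floats
        "learningRate": float(sample["learningRate"]),
    }


def fits_per_search(max_evals, num_folds):
    """Model fits for a whole search: each trial trains one model per fold,
    then CrossValidator refits once on the full training data."""
    return max_evals * (num_folds + 1)


def make_objective(train_df, num_folds=3):
    """Build a hyperopt objective that runs one CrossValidator per trial."""
    # Imported lazily so the helpers above can be used without a cluster.
    from hyperopt import STATUS_OK
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
    from synapse.ml.lightgbm import LightGBMClassifier

    # Assumed column names and metric; adjust to the real schema.
    evaluator = BinaryClassificationEvaluator(
        labelCol="label", metricName="areaUnderROC"
    )

    def objective(sample):
        clf = LightGBMClassifier(
            featuresCol="features", labelCol="label", **to_lgbm_params(sample)
        )
        cv = CrossValidator(
            estimator=clf,
            # An empty grid builds a single empty param map, so the
            # classifier is evaluated exactly as configured above.
            estimatorParamMaps=ParamGridBuilder().build(),
            evaluator=evaluator,
            numFolds=num_folds,
            seed=42,
        )
        auc = cv.fit(train_df).avgMetrics[0]
        # hyperopt minimizes the loss, so negate the AUC.
        return {"loss": -auc, "status": STATUS_OK}

    return objective


def run_search(train_df, max_evals=20):
    """Run the search sequentially on the driver with plain Trials."""
    from hyperopt import Trials, fmin, hp, tpe

    # Hypothetical search ranges, chosen only for illustration.
    space = {
        "numLeaves": hp.quniform("numLeaves", 16, 128, 1),
        "learningRate": hp.loguniform("learningRate", -5, 0),
    }
    return fmin(
        fn=make_objective(train_df),
        space=space,
        algo=tpe.suggest,
        max_evals=max_evals,
        trials=Trials(),
    )
```

Under this structure the cost grows quickly: `fits_per_search(50, 3)` gives 200 full LightGBM training runs, which may be part of why the nested approach feels slow on 7 million rows.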
SynapseML version
synapseml_2.12:1.0.8
System information
Describe the problem
I tried to combine synapse.ml.lightgbm with CrossValidator and hyperopt on Databricks. Here is what I tried and the issues I ran into:
1. Tried hyperopt with lightgbm: it failed after 43 iterations and showed a message:
2. Tried lightgbm with hyperopt and CrossValidator:
It succeeds if I limit the data to 1000 rows, but fails on the whole dataset. Even with 1000 rows it is extremely slow and Databricks does not show a progress bar, which suggests it may not be using parallel computing. I tried batches, autoScaling, numTasks, dynamicAllocation, and barrier, but none of them worked for me.
error message:
Code to reproduce issue
Other info / logs
No response
What component(s) does this bug affect?
area/cognitive: Cognitive project
area/core: Core project
area/deep-learning: DeepLearning project
area/lightgbm: Lightgbm project
area/opencv: Opencv project
area/vw: VW project
area/website: Website
area/build: Project build system
area/notebooks: Samples under notebooks folder
area/docker: Docker usage
area/models: models related issue
What language(s) does this bug affect?
language/scala: Scala source code
language/python: Pyspark APIs
language/r: R APIs
language/csharp: .NET APIs
language/new: Proposals for new client languages
What integration(s) does this bug affect?
integrations/synapse: Azure Synapse integrations
integrations/azureml: Azure ML integrations
integrations/databricks: Databricks integrations