Use 50 GB Parquet+PyArrow dataset for H2O tests on CI #1531

Closed
wants to merge 7 commits into from

Conversation

hendrikmakait
Member

Similar to #1530, the 5 GB dataset feels too small to benchmark behavior we care about.

@fjetter
Member

fjetter commented Aug 15, 2024

how long is one test run (compared to the rest of the suite)?

@hendrikmakait hendrikmakait marked this pull request as draft August 15, 2024 12:02
@hendrikmakait
Member Author

> how long is one test run (compared to the rest of the suite)?

I'll have to convert this back to draft; it looks like some of the workloads run out of memory (OOM).

@hendrikmakait hendrikmakait marked this pull request as ready for review August 22, 2024 12:54
tests/benchmarks/test_h2o.py (outdated, resolved review thread)
@hendrikmakait hendrikmakait marked this pull request as draft August 22, 2024 14:14
@hendrikmakait
Member Author

Some of these workloads cause workers to OOM, so I'm shelving this for now and will look into it at a later point.
