Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler: better and more tests #57

Merged
merged 4 commits into from
Aug 20, 2024

Conversation

khsrali
Copy link
Contributor

@khsrali khsrali commented Aug 13, 2024

Changes include:

  • tests/conftest.py

    • MockFirecrest
      • keep track of submitted jobs through Slurm.all_jobs
      • implemented cancel() for killing jobs
      • in poll_active when job_id is not found, raise FirecrestException just like firecrest does. --this includes when the job is finished.
    • removed redundant conftest.Values
  • tests/test_calculation.py::test_calculation_basic

    • support for account number which is useful for tests against real server
  • tests/test_scheduler.py

    • added tests for transport.kill_jobs
    • overall improvement in the logic. Now, all test_scheduler.py tests pass against server

@khsrali khsrali requested a review from agoscinski August 13, 2024 11:25
Copy link

codecov bot commented Aug 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.32%. Comparing base (a147f87) to head (deb4593).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #57      +/-   ##
==========================================
- Coverage   79.97%   79.32%   -0.65%     
==========================================
  Files           4        4              
  Lines         674      677       +3     
==========================================
- Hits          539      537       -2     
- Misses        135      140       +5     
Flag Coverage Δ
pytests 79.32% <100.00%> (-0.65%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@agoscinski agoscinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Looks good, just some minor comments

aiida_firecrest/transport.py Outdated Show resolved Hide resolved
@@ -35,6 +36,10 @@ def test_calculation_basic(firecrest_computer: orm.Computer):
builder = code.get_builder()
builder.x = orm.Int(1)
builder.y = orm.Int(2)
builder.metadata.options.account = firecrest_config.builder_metadata_options_account
builder.metadata.options.custom_scheduler_commands = (
"#SBATCH --constraint=mc\n#SBATCH --mem=10K"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this constraint mc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for CSCS it means, calculations to run on multicore nodes, as opposed to gpu nodes:
https://user.cscs.ch/access/running/

We can get this from firecrest_config as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, maybe you can link this here and write a comment that we need this to test against the real server of cscs. If the docker image works again these might change right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line of the code doesn't need to change in that case. Only in .firecrest-demo-config.json one can leave the field empty. So I think it's alright.

tests/test_scheduler.py Outdated Show resolved Hide resolved
tests/test_scheduler.py Outdated Show resolved Hide resolved
@@ -383,6 +400,7 @@ class ComputerFirecrestConfig:
temp_directory: str
workdir: str
api_version: str
builder_metadata_options_account: str
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can add a small docstring for this class, this variable is not so clear to me. I think most variables are explained in the README and can copy pasted to here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, docstring added.

builder_metadata_options_custom_scheduler_commands are "custom" settings that users had to define while submitting jobs. These are cluster specific variables and historically people using aiida were to set these variables in their workflow, etc..
an example of this variable is like this:
["#SBATCH --account=mr32", "#SBATCH --constraint=mc", "#SBATCH --mem=10K"]

We didn't need to do this for tests against a mocking server, but for tests against a real server these values are needed.

Note, we don't set this for the entire computer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not so related to this PR, but for my clarification it would be nice. Usually AiiDA uses prepend-text option for adding this SBATCH commands into the script but for firecrest the SBATCH commands have to be made more explicit, therefore we have custom_scheduler_commands?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I haven't explored that prepend-text variable when creating a new computer in aiida.

But you are right, some of these could be put there, the only thing is that, then it's always the same for all jobs in that computer.
Maybe one would want to run some jobs on gpu and some other mc, or different accounts for billing, etc..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case you would define multiple computers, but I am not sure about best practices

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I'm a fan of having these things editable. e.g. if one has to change his billing account, shouldn't need to define a new computer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but for mc & gpu, probably I agree, I don't really know

tests/conftest.py Outdated Show resolved Hide resolved
@@ -35,6 +36,10 @@ def test_calculation_basic(firecrest_computer: orm.Computer):
builder = code.get_builder()
builder.x = orm.Int(1)
builder.y = orm.Int(2)
builder.metadata.options.account = firecrest_config.builder_metadata_options_account
builder.metadata.options.custom_scheduler_commands = (
"#SBATCH --constraint=mc\n#SBATCH --mem=10K"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, maybe you can link this here and write a comment that we need this to test against the real server of cscs. If the docker image works again these might change right?

@agoscinski agoscinski self-requested a review August 20, 2024 05:31
Copy link
Contributor

@agoscinski agoscinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left few comments, can you resolve them and then merge? I give you already approval since it is not so important.

Co-authored-by: Alexander Goscinski <[email protected]>
@khsrali
Copy link
Contributor Author

khsrali commented Aug 20, 2024

Cheers @agoscinski, I applied/answered your suggestions

@khsrali khsrali merged commit 151e5a6 into aiidateam:main Aug 20, 2024
7 checks passed
@khsrali khsrali deleted the sheduler_better_tests branch August 20, 2024 07:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants