Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add small molecule repex consistency tests #1065

Merged
merged 96 commits into from
Apr 28, 2023
Merged

Conversation

zhang-ivy
Copy link
Contributor

@zhang-ivy zhang-ivy commented Jun 30, 2022

Description

Adds neutral and charge changing small molecule tests (the latter one has not been added yet).

Motivation and context

Resolves #959
Resolves #1061

How has this been tested?

Change log


@zhang-ivy
Copy link
Contributor Author

zhang-ivy commented Jul 2, 2022

@ijpulidos : I've pushed the small molecule neutral and charge tests. The charge test isn't done yet, but I was thinking you could take over from here? The charge test is currently yielding the error described here. Once we figure out how to fix that error, you may have to debug a bit more to make sure the test is running without any more errors and then check that its actually passing.

Note that I had to make some changes to RelativeFEPSetup to make sure that the host (cb7.mol2) can be parametrized properly. These changes are necessary because otherwise RelativeFEPSetup does not consider the host as a small molecule and therefore does not tell the SystemGenerator that it needs to parametrize it as a small molecule.

@zhang-ivy zhang-ivy changed the title [WIP] Add small molecule repex consistency tests Add small molecule repex consistency tests Jul 6, 2022
@zhang-ivy
Copy link
Contributor Author

@ijpulidos : Update: I've fixed the parametrization error in the charge test that I mentioned above

@ijpulidos
Copy link
Contributor

@zhang-ivy Thanks for this! I'll take a look and let you know if I encounter anything.

@ijpulidos
Copy link
Contributor

We need to switch to the new LangevinMiddelIntegrator once we cut a new release of openmmtools.

@zhang-ivy zhang-ivy added this to the 0.10.2 Bugfix release milestone Jul 18, 2022
@zhang-ivy
Copy link
Contributor Author

To speed up this test, see #1058 (comment)

@ijpulidos
Copy link
Contributor

ijpulidos commented Aug 15, 2022

To use the change in openmmtools 0.21.5 that uses LangevinMiddleIntegrator we would have to change the test to use LangevinDynamicsMove instead of LangevinSplittingDynamicsMove, but I see that almost all tests (except one) use the latter. How do we want to approach this? Just changing the move in this test would be enough to test what we want to test?

@ijpulidos
Copy link
Contributor

How do we want to approach this? Just changing the move in this test would be enough to test what we want to test?

From discussions with @zhang-ivy , I think we want to change all of the tests to use LangevinDynamicsMove which in turn would use the LangevinMiddleIntegrator changes in latest openmmtools.

@codecov
Copy link

codecov bot commented Aug 15, 2022

Codecov Report

Merging #1065 (7b16afc) into main (6730065) will decrease coverage by 0.32%.
The diff coverage is 3.90%.

@ijpulidos
Copy link
Contributor

As discussed in our dev sync, I saw the following results

comparison_iteration_times

According to log output it would be taking more than 4 hours for completion, which seems like too much compared to what @zhang-ivy has obtained in other tests.

The script to generate the data is attached. I basically just copied the test_RESTCapableHybridTopologyFactory_repex_charge_transformation and made it a script. To run you can use python test_repex_charge.py 2> stderr_out.txt for example, this would write the stderr output to the specified file. You can generate the plots (assuming you have two log files you want to compare), using the attached make_plots.py barebones script (modify as needed).

We should also try to run this using the nightly versions of Openmm to see further performance boosts.

test_repex_charge.tar.gz

@@ -54,16 +54,15 @@ def test_RESTCapableHybridTopologyFactory_repex_neutral_mutation():
# Set up repex simulation
reporter_file = os.path.join(temp_dir, f"{wt_name}-{mutant_name}.nc")
reporter = MultiStateReporter(reporter_file, checkpoint_interval=10)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can set the checkpoint_interval to be 100 here. There's no need to checkpoint this frequently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we don't ever use the positions, so we probably don't need to write to the checkpoint file at all here.
The energies should be written every iteration regardless of the checkpoint_interval.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using 100 here for the checkpoint_interval worked okay in the preliminary tests

@@ -157,16 +155,15 @@ def test_RESTCapableHybridTopologyFactory_repex_charge_mutation():
# Set up repex simulation
reporter_file = os.path.join(temp_dir, f"{wt_name}-{mutant_name}.nc")
reporter = MultiStateReporter(reporter_file, checkpoint_interval=10)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can set the checkpoint_interval to 100 here.

@mikemhenry
Copy link
Contributor

Tests took 6hr30min https://github.com/choderalab/perses/actions/runs/4747425819/jobs/8432377018

with only one failure:

FAILED perses/tests/test_repex.py::test_RESTCapableHybridTopologyFactory_repex_charge_mutation - FileNotFoundError: [Errno 2] No such file or directory: '/actions-runner/_work/perses/perses/perses/data/ala_solvated.cif'

@ijpulidos
Copy link
Contributor

@mikemhenry that's great news. The failing test is just that we have the inverse order in the transformation, I made the changes but forgot to push them, these will add a few hours to the tests but I think we should be okay. I'll push them in a second.

env:
TEST_MODE: GPU
OPENMM: ${{ matrix.cfg.openmm }}
OPENMM: 8.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need this for when we merge in #1186

@zhang-ivy
Copy link
Contributor Author

@ijpulidos : I just reviewed again and your changes look good to me.

A couple of other things to address:

@ijpulidos
Copy link
Contributor

Tests took 6hr30min https://github.com/choderalab/perses/actions/runs/4747425819/jobs/8432377018

with only one failure:

FAILED perses/tests/test_repex.py::test_RESTCapableHybridTopologyFactory_repex_charge_mutation - FileNotFoundError: [Errno 2] No such file or directory: '/actions-runner/_work/perses/perses/perses/data/ala_solvated.cif'

@mikemhenry Last run of the manually run GPU CI tests was successful https://github.com/choderalab/perses/actions/runs/4766605648/jobs/8473881768#step:9:571

@zhang-ivy thanks for the comments/review. I'll be adressing these as soon as possible.

Copy link
Member

@jchodera jchodera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GPU tests took 10 hours (~$5 per run), with 8.5 hours spent in these two tests:

16633.61s call     perses/tests/test_repex.py::test_RESTCapableHybridTopologyFactory_repex_charge_mutation
13661.38s call     perses/tests/test_repex.py::test_RESTCapableHybridTopologyFactory_repex_charge_transformation
  • Are we sure the tests are running on the GPU? We aren't logging debug output that would report the Platform that the tests are running on, but we should probably confirm this.
  • Is there any way we can reduce the system size for these tests so that runtime is shortened?
  • I've highlighted lines where we might be able to halve the number of steps per iteration, which would cut the runtime in half.

If we can't reduce the cost drastically, we might have to only enable these tests in special circumstances, such as once per day on main.

perses/tests/test_repex.py Outdated Show resolved Hide resolved
perses/tests/test_repex.py Outdated Show resolved Hide resolved
perses/tests/test_repex.py Outdated Show resolved Hide resolved
perses/tests/test_repex.py Outdated Show resolved Hide resolved
@ijpulidos
Copy link
Contributor

Good news, we can use 125 steps per iteration and still have the tests passing, we probably could do a bit of more refining here to lower the walltimes, but we moved from ~10 hours, to ~6.5 hours in the latest manual run of these tests here.

@ijpulidos
Copy link
Contributor

And as we discussed in our dev sync, these tests would only be run on a nightly basis, and maybe we can discuss about running them less frequently if we need to. Monitoring the costs here shouldn't be hard on the AWS dashboard.

@ijpulidos ijpulidos requested a review from mikemhenry April 27, 2023 20:57
@mikemhenry mikemhenry enabled auto-merge (squash) April 28, 2023 17:45
@mikemhenry mikemhenry merged commit 8e3e423 into main Apr 28, 2023
@mikemhenry mikemhenry deleted the add-small-molecule-test branch April 28, 2023 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Unit tests
Projects
None yet
4 participants