Feature restart #5
Adds a restart flag that checks the output_directory for pre-existing sire systems; if they are present, they are loaded and used as a starting point for the runner. In this case, runtime represents the total time for the system. Also adds a supress_overwrite_warning flag that removes the need for the user to press enter before files are overwritten in the case of restart=False.
…is used. [ci skip]
…point file is found
Fixes a bug in which the charge scale factor is referenced before being set during the setup of a charge-scaled morph. It is now set to a temporary value, which is then overwritten when the charge scale factor is set.
…t cannot be changed between runs throw the relevant error. Options that cannot be changed are the following: 'runtime', 'restart', 'temperature', 'minimise', 'max_threads', 'equilibration_time', 'equilibration_timestep', 'energy_frequency', 'save_trajectory', 'frame_frequency', 'save_velocities', 'checkpoint_frequency', 'platform', 'max_threads', 'max_gpus', 'run_parallel', 'restart', 'save_trajectories', 'write_config', 'log_level', 'log_file', 'supress_overwrite_warning'
…iting still exist.
…itself at the correct level. Also adds a test for the writing of the logfile
Added an extra parameter to config.as_dict to make it equivalent to sire outputs. Restarts are checked against the previous config; if this config doesn't exist, the runner checks the lambda=0 checkpoint file for the previous config. yaml safe load throws a different exception on Windows, which was causing the test to fail. This should now be fixed.
…o Path in as_dict, removing lambda symbols on windows, changed filenames in test_logfile_creation
Config encoded into checkpoint files must match either the config.yaml file or the config from the lambda=0 window
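The matching rule above can be sketched roughly as follows. This is only an illustration, not the PR's actual _compare_configs: the function name compare_configs, the plain-dict representation, and the may_change set are all assumptions made for the example.

```python
def compare_configs(previous: dict, current: dict, may_change: frozenset) -> None:
    """Raise ValueError if any option outside `may_change` differs between runs.

    Hypothetical helper: both configs are assumed to be plain dictionaries.
    """
    mismatched = sorted(
        key
        for key in previous.keys() | current.keys()
        if key not in may_change and previous.get(key) != current.get(key)
    )
    if mismatched:
        raise ValueError(f"Options changed between runs: {', '.join(mismatched)}")


# Illustrative set only; the real list of changeable options lives in the PR.
MAY_CHANGE = frozenset({"runtime", "restart", "log_level"})

old = {"runtime": "2ns", "temperature": "300K", "restart": False}
new = {"runtime": "5ns", "temperature": "300K", "restart": True}
compare_configs(old, new, MAY_CHANGE)  # no error: only changeable options differ
```

Passing the changeable options in as a set keeps the comparison itself independent of whichever options the project finally decides to lock.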
Thanks, @mb2055, this looks great. There are just a few minor comments and things to address.
Other than those, my only thought is whether we want to make the system command line option into an optional argument rather than a positional argument, which is required. That way you could simply omit the system if running a restart and we'll just check the output directory. In this case you would run somd2 as:
# First run.
somd2 --system some_system.s3
# Second run. (Will check and use output in the output directory.)
somd2 --restart --runtime 10ns
This avoids the confusion of the user passing in a different system to the one in the checkpoint files from the output directory. If they pass a system as well, then we could just warn that they are performing a restart and that it will be ignored.
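As a rough sketch of that idea, assuming an argparse-based command line (the real somd2 CLI may be wired differently, and parse_args is a hypothetical helper used only for illustration):

```python
import argparse
import warnings


def parse_args(argv):
    """Parse a cut-down, hypothetical somd2 command line."""
    parser = argparse.ArgumentParser(prog="somd2")
    # The system becomes optional instead of a required positional argument.
    parser.add_argument("--system", default=None, help="path to the input system")
    parser.add_argument("--restart", action="store_true", help="restart from checkpoints")
    parser.add_argument("--runtime", default=None, help="total runtime, e.g. 10ns")
    args = parser.parse_args(argv)

    if args.restart and args.system is not None:
        # On restart the checkpoint files take precedence over the passed system.
        warnings.warn("Performing a restart: the --system argument will be ignored.")
        args.system = None
    elif not args.restart and args.system is None:
        parser.error("--system is required unless --restart is given")
    return args


first = parse_args(["--system", "some_system.s3"])
second = parse_args(["--restart", "--runtime", "10ns"])
```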
What do you think?
I considered making |
That's a good point. For now I think it's easiest to keep things as they are. I imagine the use case might be just re-running the same command multiple times in a loop, e.g. to re-run until all windows complete, in which case the command-line will be the same. We should probably document the behaviour somewhere, though. This would also mean that we probably need to do some basic system comparison checks too, i.e. same number of molecules, residues, atoms, etc.
Changes/removal of comments in both config and test files. Removal of redundant variable in config. lam_sym is now _lam_sym and is only declared in runner.py, being imported by dynamics.py. Temperature can no longer be changed between restarts; an appropriate test has been added to ensure this functionality. Extra_args is no longer explicitly checked in the _compare_configs function, and is now ignored by the _to_dict function of _config.py.
…ed to raise a warning here as the user is warned in the runner before any files are removed. [ci skip]
Thanks, the updates look good. Is it possible to add a simple check that the checkpoint systems are the same as the one passed from the command line? You could first validate that the UID of each checkpoint is the same, then make sure that it has the same number of molecules, residues, and atoms as the one on the command line. If every lambda window has a checkpoint file, then you might want to do something like log a warning that the system that was passed via the command line will be ignored. (In this particular case there is probably no need to validate that it's the same as each of the checkpoints.)
…er of molecules, number of residues and number of atoms, returning true only if all match. This function is used to check equivalence between checkpoint files and the original system
The latest commit adds a check that checkpoint files match the original system.
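A minimal sketch of that kind of check is shown below. It deliberately avoids the sire API: SystemSummary and systems_match are hypothetical stand-ins, and the counts are assumed to have been read from the loaded systems beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemSummary:
    """Hypothetical stand-in for the properties read from a loaded system."""
    num_molecules: int
    num_residues: int
    num_atoms: int


def systems_match(original: SystemSummary, checkpoint: SystemSummary) -> bool:
    """Return True only if molecule, residue and atom counts all agree."""
    return (
        original.num_molecules == checkpoint.num_molecules
        and original.num_residues == checkpoint.num_residues
        and original.num_atoms == checkpoint.num_atoms
    )


# Made-up counts, purely for illustration.
original = SystemSummary(num_molecules=1200, num_residues=1250, num_atoms=4100)
same = SystemSummary(num_molecules=1200, num_residues=1250, num_atoms=4100)
different = SystemSummary(num_molecules=1200, num_residues=1250, num_atoms=4099)
```

Matching counts do not prove two systems are identical, so this is a cheap sanity check rather than a full comparison.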
Good point. I think it's okay to just perform the check when the checkpoint is loaded, rather than pre-loading everything. I think this would essentially never happen and we'd only be ensuring that we don't run any windows where the system doesn't match.
Thanks @mb2055. Looks good!
Adds support for restarting from checkpoints created by previous SOMD2 simulations. This is achieved with a new restart config option: if True, SOMD2 will search the output-directory for pre-existing configs and checkpoint files, using the checkpoint files as a starting point for the current run.
Features/requirements to note:
- runtime should be set as the total desired runtime for each window. If, for example, runtime is set to 5ns and the pre-existing checkpoint file already has a runtime of 2ns, the simulation run from restart will run for 3ns (to bring the total time to 5ns).
- If no config.yaml is present in the output-directory then configurations will be checked against that of the lambda=0 checkpoint file.
- If restart is not enabled, existing files in the output-directory may be deleted. A list of files will be given as a logger warning and the user will be required to press enter to confirm deletion. This requirement can be turned off by adding supress-overwrite-warning to the command line (or setting supress-overwrite-warning = True if using the Python API).
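The runtime rule described above amounts to a simple subtraction. The sketch below is illustrative only: remaining_runtime_ns is a hypothetical helper working in plain nanosecond floats, and clamping to zero when a checkpoint has already reached the target is an assumption rather than confirmed SOMD2 behaviour.

```python
def remaining_runtime_ns(total_runtime_ns: float, completed_ns: float) -> float:
    """Return how much simulation time is still needed to reach the total."""
    if total_runtime_ns < 0 or completed_ns < 0:
        raise ValueError("times must be non-negative")
    # Assumption: a window that already meets the target needs no further time.
    return max(0.0, total_runtime_ns - completed_ns)


# Example from the description: 5ns requested, 2ns already in the checkpoint.
print(remaining_runtime_ns(5.0, 2.0))  # 3.0
```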