Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc exp push: pushing/pulling cache fails if params.yaml not in workspace #9768

Closed
dberenbaum opened this issue Jul 26, 2023 · 15 comments · Fixed by #10128
Closed

dvc exp push: pushing/pulling cache fails if params.yaml not in workspace #9768

dberenbaum opened this issue Jul 26, 2023 · 15 comments · Fixed by #10128
Labels
A: experiments Related to dvc exp bug Did we break something? p2-medium Medium priority, should be done, but less important

Comments

@dberenbaum
Copy link
Collaborator

dberenbaum commented Jul 26, 2023

Bug Report

Description

If there is no params.yaml in the workspace and interpolation is used in dvc.yaml, dvc exp push will fail to push the cache for the experiments.

Reproduce

Fork and clone https://github.com/dberenbaum/params-test and then run:

dvc exp run
rm params.yaml
dvc exp push origin

You will see an error like ERROR: failed to push cache: failed to parse 'stages.params_test.cmd' in 'dvc.yaml': Could not find 'params'.

Expected

Since params.yaml is present in the experiment, it should be possible to push it. Can we interpolate using the state of the experiment rather than the workspace?

@dberenbaum dberenbaum added bug Did we break something? A: experiments Related to dvc exp labels Jul 26, 2023
@dberenbaum
Copy link
Collaborator Author

It seems very loosely related to #9754 in that we aren't respecting what's in the revision but instead are using the workspace.

@daavoo
Copy link
Contributor

daavoo commented Jul 26, 2023

For the record in case someone encounters this in the context of using Hydra integration:

I missed the Hydra part before. That might be easier to handle as a special case in DVC.
And even though it still adds some friction, assuming the Hydra stuff is present in the workspace, you could do dvc exp run --dry as a fallback.
That would create the params.yaml in the workspace and allow to run dvc commands that would previously fail at the interpolation stage

@dberenbaum dberenbaum added this to DVC Jul 31, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC Jul 31, 2023
@dberenbaum dberenbaum added the p2-medium Medium priority, should be done, but less important label Aug 1, 2023
@dberenbaum dberenbaum removed the status in DVC Aug 1, 2023
@dberenbaum dberenbaum removed this from DVC Aug 1, 2023
@dberenbaum dberenbaum changed the title dvc exp push: pushing cache fails if params.yaml not in workspace dvc exp push: pushing/pulling cache fails if params.yaml not in workspace Aug 5, 2023
@dberenbaum
Copy link
Collaborator Author

This can also happen during pull, and not only on the first run, since parameters may change frequently (esp with hydra), and any missing parameter in the workspace causes a failure. See https://discord.com/channels/485586884165107732/485596304961962003/1137288948914339860.

@daavoo
Copy link
Contributor

daavoo commented Aug 7, 2023

This can also happen during pull, and not only on the first run, since parameters may change frequently (esp with hydra), and any missing parameter in the workspace causes a failure. See https://discord.com/channels/485586884165107732/485596304961962003/1137288948914339860.

Should we compose_and_dump by default on dvc exp commands?

@dberenbaum
Copy link
Collaborator Author

This can also happen during pull, and not only on the first run, since parameters may change frequently (esp with hydra), and any missing parameter in the workspace causes a failure. See https://discord.com/channels/485586884165107732/485596304961962003/1137288948914339860.

Should we compose_and_dump by default on dvc exp commands?

Do you mean in the workspace? I don't think it would solve the issue mentioned in discord since it seems like whether the params exist depends on the config in the specific experiment, and it may also mess up the user's workspace.

@Danila89
Copy link

Danila89 commented Nov 27, 2023

Are there any updates on this issue? Garbage collection is also affected: dvc gc -a -c --remote remote-name --all-experiments --force fails with similar error

@dberenbaum
Copy link
Collaborator Author

@Danila89 I don't have any updates unfortunately. Have you tried adding params.yaml to your workspace? I think it should work once you have a params.yaml file with the expected parameters.

@Danila89
Copy link

@Danila89 I don't have any updates unfortunately. Have you tried adding params.yaml to your workspace? I think it should work once you have a params.yaml file with the expected parameters.

It will, but it is inconvenient. Different experiments have different parameters at params.yaml (I mean not just different values, but different parameter names as well). And for dvc exp pull or dvc gc I have to get the proper params.yaml first. Using garbage collection when many experiments are deleted becomes literally impossible(

@dberenbaum
Copy link
Collaborator Author

Upon further investigation, the cache is pushed properly, so this is mostly a UI issue. Brancher always yields the workspace, so an invalid workspace state will throw an error even if a command like dvc exp push only relies on non-workspace revs:

dvc/dvc/repo/brancher.py

Lines 76 to 80 in 7c61463

yield "workspace"
revs = revs.copy() if revs else []
if "workspace" in revs:
revs.remove("workspace")

Can we yield workspace only if it's included in revs @iterative/dvc?

Reproducible example
set -ux

# Clone repo and setup remotes
REMOTE=/private/tmp/dvcremote
rm -rf params-test params-test-remote
rm -rf $REMOTE
mkdir $REMOTE
git clone -q [email protected]:dberenbaum/params-test.git params-test-remote
git clone -q params-test-remote params-test
cd params-test
dvc remote modify --local default url $REMOTE

# Run experiment with different params and then drop params.yaml
dvc exp run -q -S params=bar
rm params.yaml

# Push experiments and check what's been pushed
dvc exp push origin
dvc exp list origin
tree $REMOTE

# Revert to original params
git restore .

# Run gc and check if garbage collected
dvc exp remove -q --all
dvc gc -f -c -w --all-experiments
tree $REMOTE

Here is the output:

+ REMOTE=/private/tmp/dvcremote
+ rm -rf params-test params-test-remote
+ rm -rf /private/tmp/dvcremote
+ mkdir /private/tmp/dvcremote
+ git clone -q [email protected]:dberenbaum/params-test.git params-test-remote
+ git clone -q params-test-remote params-test
+ cd params-test
+ dvc remote modify --local default url /private/tmp/dvcremote
+ dvc exp run -q -S params=bar
+ rm params.yaml
+ dvc exp push origin
ERROR: failed to collect 'workspace' - failed to parse 'stages.params_test.cmd' in 'dvc.yaml': Could not find 'params'
Pushed experiment grand-eyas to Git remote 'origin'.
1 file uploaded
+ dvc exp list origin
main:
        grand-eyas
+ tree /private/tmp/dvcremote
/private/tmp/dvcremote
└── files
    └── md5
        └── 03
            └── 6683c57ba049f38cd3dd4c20dd119e

3 directories, 1 file
+ git restore .
+ dvc exp remove -q --all
+ dvc gc -f -c -w --all-experiments
WARNING: This will remove all cache except items used in the workspace and all experiments of the current repo.
Removed 1 objects from repo cache.
No unused 'local' cache to remove.
No unused 'legacy' cache to remove.
Removed 1 objects from remote.
No unused cache to remove from remote.
+ tree /private/tmp/dvcremote
/private/tmp/dvcremote
└── files
    └── md5
        └── 03

3 directories, 0 files

@Danila89 tldr This is not likely the same as the issue you are seeing, and dvc is using the correct params.yaml associated with each individual experiment. If you are seeing a similar error during dvc gc -a -c --remote remote-name --all-experiments --force, it likely means that one of your branches (since -a specifies all branches) is invalid and missing params.yaml entirely.

@Danila89
Copy link

Danila89 commented Nov 29, 2023

@dberenbaum thank you for the update!
It seems that I don't follow the best practise of working with dvc+hydra integration. Could you please clarify:

  1. Should I have params.yaml in every branch in my repo (even if there are no experiments in the branch)?
  2. How should I handle modification of hydra config (some parameters added, some parameters deleted)? What version of the params.yaml (if any) should I keep in each git branch?

In my project I have:

  1. Master branch without params.yaml
  2. Feature branches that are used for staff not directly related to running the experiments. These branched don't have params.yaml
  3. Experiment branches. These branches are created using dvc exp branch and they contain params.yaml
    The experiment workflow is the following:
  4. From master of some feature branch I run dvc exp run
  5. dvc exp branch
  6. dvc exp push
  7. git checkout experiment-branch -f
  8. git push origin experiment-branch

I would appreciate any advice on what to change in my current workflow to get rid of this invalid params.yaml problem.
I seems that for garbage collection I can just drop -a option and that works for me, but dvc exp pull is still not available for me from master branch.

@dberenbaum
Copy link
Collaborator Author

I seems that for garbage collection I can just drop -a option and that works for me

👍 If you don't intend to keep params.yaml on your branches, this is the way to go.

dvc exp pull is still not available for me from master branch.

Can you show the output for this or maybe how to reproduce it? I'm able to run dvc exp pull even when there is no params.yaml in either the branch or the workspace.

@Danila89
Copy link

Danila89 commented Dec 1, 2023

Sorry, it happens that the problem reproduces just with DVC 2 and vanishes in DVC 3.
It outputs the error message like this ERROR: failed to collect 'workspace' - failed to parse 'stages.main.cmd' in 'dvc.yaml': Could not find 'parameters.parameter_name', but it does not prevent normal work of dvc pull/push. Probably this issue could be closed, thank you very much!

@Danila89
Copy link

Danila89 commented Dec 7, 2023

Sorry, I've checked once again, with dvc 3.33.3 - dvc gc still fails without params.yaml in the active branch.
To reproduce: git clone https://github.com/Danila89/dvc_empty.git && cd dvc_empty && dvc gc --all-experiments --force

@dberenbaum
Copy link
Collaborator Author

@Danila89 It's happening because dvc gc always includes the workspace right now, and your workspace is technically in an "invalid" state in that dvc can't read it. We can add a --no-workspace option so that you can gc only the experiments and not the workspace. Would that work for you?

@Danila89
Copy link

Danila89 commented Dec 7, 2023

@dberenbaum Yes it will, thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: experiments Related to dvc exp bug Did we break something? p2-medium Medium priority, should be done, but less important
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants