dvc exp push: pushing/pulling cache fails if params.yaml not in workspace #9768
Comments
It seems very loosely related to #9754 in that we aren't respecting what's in the revision but are using the workspace instead.
For the record, in case someone encounters this while using the Hydra integration: I missed the Hydra part before. That might be easier to handle as a special case in DVC.
This can also happen during pull, and not only on the first run, since parameters may change frequently (especially with Hydra), and any missing parameter in the workspace causes a failure. See https://discord.com/channels/485586884165107732/485596304961962003/1137288948914339860.
Should we
Do you mean in the workspace? I don't think it would solve the issue mentioned in Discord, since whether the params exist seems to depend on the config in the specific experiment, and it may also mess up the user's workspace.
Are there any updates on this issue? Garbage collection is also affected:
@Danila89 I don't have any updates unfortunately. Have you tried adding
It will, but it is inconvenient. Different experiments have different parameters at
Upon further investigation, the cache is pushed properly, so this is mostly a UI issue. Brancher always yields the workspace, so an invalid workspace state will throw an error even if a command like Lines 76 to 80 in 7c61463
Can we yield workspace only if it's included in
Reproducible example
set -ux
# Clone repo and setup remotes
REMOTE=/private/tmp/dvcremote
rm -rf params-test params-test-remote
rm -rf $REMOTE
mkdir $REMOTE
git clone -q [email protected]:dberenbaum/params-test.git params-test-remote
git clone -q params-test-remote params-test
cd params-test
dvc remote modify --local default url $REMOTE
# Run experiment with different params and then drop params.yaml
dvc exp run -q -S params=bar
rm params.yaml
# Push experiments and check what's been pushed
dvc exp push origin
dvc exp list origin
tree $REMOTE
# Revert to original params
git restore .
# Run gc and check if garbage collected
dvc exp remove -q --all
dvc gc -f -c -w --all-experiments
tree $REMOTE
Here is the output:
@Danila89 tl;dr: This is not likely the same as the issue you are seeing, and dvc is using the correct
@dberenbaum thank you for the update!
In my project I have:
I would appreciate any advice on what to change in my current workflow to get rid of this invalid
👍 If you don't intend to keep params.yaml on your branches, this is the way to go.
Can you show the output for this or maybe how to reproduce it? I'm able to run
Sorry, it turns out that the problem reproduces only with DVC 2 and vanishes in DVC 3.
Sorry, I've checked once again, with DVC 3.33.3 -
@Danila89 It's happening because
@dberenbaum Yes it will, thanks in advance!
Bug Report
Description
If there is no params.yaml in the workspace and interpolation is used in dvc.yaml, dvc exp push will fail to push the cache for the experiments.
Reproduce
Fork and clone https://github.com/dberenbaum/params-test and then run:
You will see an error like
ERROR: failed to push cache: failed to parse 'stages.params_test.cmd' in 'dvc.yaml': Could not find 'params'.
Expected
Since params.yaml is present in the experiment, it should be possible to push it. Can we interpolate using the state of the experiment rather than the workspace?
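To make the failure mode concrete, here is a minimal sketch of templating a stage command against two different contexts. It is not DVC's actual code: the `interpolate` helper, the `echo ${params}` command, and both context dictionaries are hypothetical stand-ins chosen to mirror the error message above.

```python
import re

# Hypothetical stand-in for dvc.yaml templating; NOT DVC's real implementation.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][\w.]*)\}")


def interpolate(template: str, context: dict) -> str:
    """Replace each ${key} (dotted keys allowed) with its value from context."""

    def _lookup(match: re.Match) -> str:
        value = context
        for part in match.group(1).split("."):
            if not isinstance(value, dict) or part not in value:
                raise KeyError(f"Could not find '{match.group(1)}'")
            value = value[part]
        return str(value)

    return _PLACEHOLDER.sub(_lookup, template)


# Assumed shape of the stage command; the real repo's cmd may differ.
cmd = "echo ${params}"

# Context as the experiment revision would supply it (params.yaml exists there).
experiment_context = {"params": "bar"}
# Context built from a workspace where params.yaml has been deleted.
workspace_context = {}

resolved = interpolate(cmd, experiment_context)

try:
    interpolate(cmd, workspace_context)
    workspace_error = None
except KeyError as exc:
    workspace_error = exc.args[0]
```

Under these assumptions, the same command resolves when the context comes from the experiment and fails when it comes from the workspace, which is the behavior the "Expected" section asks to change.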