
Revert work-around removal from #719 and properly fix #711 #770

Merged
merged 1 commit into google:master on Jan 28, 2025

Conversation

@Vampire (Contributor) commented Jan 20, 2025

No description provided.

google-cla bot commented Jan 20, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up-to-date status, view the checks section at the bottom of the pull request.

@ejona86 (Collaborator) left a comment

It's not clear what your primary goal is, but excluded tasks still seem to be broken: gradle/gradle#21179. So I don't know why you are removing the workaround.

@Vampire (Contributor, Author) commented Jan 21, 2025

I don't remove any work-around; what is broken with my PR is also broken without it.
#719 already reverted the work-around, it just did not clean up properly.
All I remove is code that became unneeded with the changes in #719.
You probably have to pick your poison: either the configuration cache works (fixed by #719) or excluding tasks works without error; I don't think there is a middle way.
That is because when the CC entry is stored, all providers that do not carry task dependencies are eagerly evaluated and their result is persisted to the CC entry, while all providers that do carry task dependencies are evaluated at execution time, as their result depends on the outcome of task execution.
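To make that difference concrete, a minimal hypothetical build-script sketch (not code from this plugin; the task and the value produced are made up for illustration):

```groovy
// A provider derived from a task output carries task dependency information,
// so with the configuration cache it is evaluated at execution time,
// after 'jar' has run.
def archive = tasks.named('jar', Jar).flatMap { it.archiveFile }

// A plain provider carries no task dependency information, so it is
// evaluated while the CC entry is stored and only its value is persisted.
def timestamp = providers.provider { System.currentTimeMillis().toString() }
```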

@Vampire (Contributor, Author) commented Jan 22, 2025

Also, since 0521fe7 the original reproducer does not reproduce with the work-around removed.

@Vampire (Contributor, Author) commented Jan 22, 2025

The original reproducer still reproduces if one wants to exclude processResources, though.

I played with the code, and if you really want to support excluding arbitrary tasks together with the configuration cache, there is an even uglier variant of the original work-around: introduce a dummy no-action task that is used as a task dependency for the ProtobufExtract tasks, to delay the provider evaluation to execution time. The only thing that will not work then is excluding that no-op dummy task itself. But all other tasks can be excluded, and the configuration-cache test added with #719 also passes.
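A rough sketch of that idea, just to show the shape (the task name is invented and this is not the exact code pushed in this PR; inputFiles is the ConfigurableFileCollection on ProtobufExtract mentioned below):

```groovy
// No-op task whose only purpose is to contribute a task dependency.
def ccWorkaroundTask = tasks.register('protobufCcWorkaround') {
    // intentionally no actions
}

tasks.withType(ProtobufExtract).configureEach {
    // Marking the no-op task as a producer of the inputs makes the provider
    // chain derived from them carry a task dependency, so its evaluation is
    // delayed until execution time instead of CC entry store time.
    inputFiles.builtBy(ccWorkaroundTask)
}
```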

If you want that version instead, I can change this PR to reintroduce the original work-around including that extension.

@Vampire Vampire changed the title from "Cleanup left-overs of #719" to "Revert work-around removal from #719 and properly fix #711" on Jan 22, 2025
@Vampire (Contributor, Author) commented Jan 22, 2025

I pushed the new work-around version that I mentioned.

@Vampire Vampire force-pushed the master branch 2 times, most recently from 3f60efc to 62cbb4e on January 22, 2025 at 18:38
@Vampire (Contributor, Author) commented Jan 22, 2025

The CodeNarc violation is because addTasksForSourceSet now has an additional parameter.
If you like the changes otherwise, please tell me how you want that resolved in your codebase.

@ejona86 (Collaborator) left a comment

I'm paging all this back into my memory. I see more now what your earlier commit did. The whims of the configuration cache still partly elude me, such that I still don't follow how #713 made it better, but I also suspect that AGP is partly to blame. I do follow this change though: the task delays resolution.

In general I'm not too partial to supporting excluding tasks, but I also understand there's performance impacts to parts of the plugin behavior.

Codenarc is complaining about too many parameters to a method? @SuppressWarnings and codenarc-disable don't seem to work for our configuration, so let's just disable that check:

diff --git a/config/codenarc/codenarc.xml b/config/codenarc/codenarc.xml
index 12e743c..7feebb8 100644
--- a/config/codenarc/codenarc.xml
+++ b/config/codenarc/codenarc.xml
@@ -77,6 +77,7 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
     <exclude name="CyclomaticComplexity"/>
     <exclude name="NestedBlockDepth"/>
     <exclude name="MethodCount"/>
+    <exclude name="ParameterCount"/>
   </ruleset-ref>
   <ruleset-ref path='rulesets/unused.xml'/>
   <ruleset-ref path='rulesets/unnecessary.xml'>

@Vampire (Contributor, Author) commented Jan 22, 2025

> The whims of the configuration cache still partly elude me, such that I still don't follow how #713 made it better

Well, the configuration cache is actually not that complex once you get the hang of it; attribute-/variant-aware resolution is the harder nut to crack.
The explanation has a decent length, but only because I try to make it crystal clear.
If you don't care, just don't read it; it is only an explanation of the situation in case you or someone else is interested. :-)

The configuration cache tries to save as much time as possible up to the execution phase.

Resolving external dependencies? => Does not need to be done at execution time, as it would always give the same result, so it can be done before the CC entry is saved and the result persisted.

Artifact transforms of external dependencies? => Do not need to be done at execution time, as they would always give the same result, so they can be done before the CC entry is saved and the result persisted.

A provider chain that evaluates to a certain value? => Does not need to be done at execution time, as it should always give the same result, so all the providers are evaluated to their final value at CC entry store time and the result is persisted.
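For example, a chain like the following has no task dependencies, so with the configuration cache the whole chain is collapsed at store time and only the final value ends up in the entry (illustrative only; the file name is invented):

```groovy
// Read at CC entry store time; the file becomes a CC input and only the
// resulting String is persisted in the entry.
def protocVersion = providers.provider { file('protoc-version.txt').text }
        .map { it.trim() }
```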

There is one exception though.
A provider can carry task dependency information.
That means the value of the provider depends on the execution result of a task, e.g. an output property of a task.
To calculate such a provider, the task first has to have been executed.
In such a case the provider is not calculated at CC entry store time but at execution time, i.e. after the dependency task has already been executed.
Think for example of a task provider of a task that produces a file, mapped to a string that is the content of that file.
The closures needed for this provider chain then have to be serialized into the CC entry so that they can be deserialized for evaluation at execution time.
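A concrete sketch of that example (the task name and its output property are invented; assume outputFile is a RegularFileProperty output of the task):

```groovy
def generatedText = tasks.named('generateVersionFile')
        .flatMap { it.outputFile }   // Provider<RegularFile>, carries the task dependency
        .map { it.asFile.text }      // still carries the task dependency
// With the configuration cache, the two closures above are serialized into
// the entry and only run at execution time, after 'generateVersionFile'
// has produced the file.
```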

#713 / #719 were effectively a revert of the work-around for #550, just leaving behind some bits that had become useless, i.e. what I originally changed in this PR.
Before #713 / #719 there was inputFiles.filter { false } and providerFactory.provider { ... inputFiles.files mapped to some unpacked files and so on ... }.
inputFiles is a ConfigurableFileCollection, which "is a provider of files" and can also carry task dependencies, for example if you do from(someTask) or from(someProviderWithTaskDependency); filter called on it preserves the task dependencies, so the first part still has the task dependency and is thus evaluated at task execution time, where all contents are then filtered away.
providerFactory.provider { ... } is always a provider without task dependencies, so that provider is calculated at CC entry store time, which causes all accessed files to become CC inputs and the result of the calculation to be stored in the CC entry. That is incorrect though, as the files produced by dependency tasks are either not produced yet, or, if they are found, they are the stale ones from the last execution.
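Roughly, the pre-#713 / #719 shape looked like this (heavily simplified sketch, not the literal plugin code; unpack() stands in for the plugin's extraction logic):

```groovy
// Keeps the task dependencies of inputFiles but contributes no files,
// so the consuming task still waits for the producing tasks.
def dependencyCarrier = inputFiles.filter { false }

// No task dependencies here, so with the configuration cache this closure
// runs at CC entry store time and may see absent or stale task outputs.
def unpackedFiles = providerFactory.provider {
    inputFiles.files.collect { unpack(it) }
}
```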

#713 / #719 reverted it back to the idiomatic state where inputFiles.getElements().map { ... input files mapped to some unpacked files and so on ... } is used. inputFiles is still the ConfigurableFileCollection with the task dependencies, elements of it is a provider of file system locations that preserves the task dependencies, and map on that also preserves the task dependencies. As the provider again carries task dependency information, its evaluation is again delayed to execution time and not prematurely calculated at CC entry store time.
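And the idiomatic shape restored by #713 / #719, again simplified (unpack() is the same placeholder as above):

```groovy
// elements is a Provider<Set<FileSystemLocation>> that keeps the task
// dependencies of inputFiles, and map() keeps them as well, so this
// provider is only evaluated at execution time.
def unpackedFiles = inputFiles.elements.map { locations ->
    locations.collect { unpack(it.asFile) }
}
```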

Now to the problem with the excluded tasks. If you call map on a provider that has task dependencies and then try to evaluate the result before the dependency task has run, Gradle has a detection that fails the build. Remember the example I mentioned above: a provider of a task that produces a file, on which map is called to read the file contents, giving a Provider<String> as result. If that resulting provider is evaluated before the dependency task has ever been executed, the file will not be present; if the task ran before, the result will be the stale file contents from the last time it was executed. If you exclude the task in question, that task did not execute beforehand, and the detection correctly yells at you.

With the work-around that was in place before #713 / #719, and with this PR in its current state merged, you basically just disable this warning, telling Gradle "I know for sure that everyone using my plugin is a very knowledgeable Gradle expert who is fully aware of what he is doing when he excludes a task, and it is fine to work on non-existent or stale files in this case".

You have the same problem in other places, like when copying task outputs around and so on; you just do not get an error for that yet. Or when resolving, at configuration time, a configuration that uses artifact transforms and depends on projects in the same build. Even without task exclusion, these artifact transforms are then executed on the absent or stale task outputs, and that result is persisted and used later on, even after the respective task has actually been executed. Additionally, the task dependency in that configuration is marked as fulfilled even though the task was not executed, so if no other part of the build depends on that task, the effect is the same as if the task had been excluded.

Actually, the Gradle folks even have plans to deprecate enabled = false and -x, and then probably also -a. Though I hope they change their mind in that regard, because for Gradle experts who know what they are doing and how and when to use them properly, they are invaluable, even if the average user should probably better stay away from those switches.

I said "not yet" because after gradle/gradle#11100 is done - and I actually hope it comes as it would make many inherently broken and unreliable builds fail instead so they can properly be fixed, just hopefully without sacrificing -x, -a, ... - then also the old or new work-arounds will fail as then also the inputFiles.filter { false }most likely will fail. Maybe the work-around could then be changed to insteaddependsOn` the task if that is still possible as mentioned in the thoughts of that ticket, but that is a topic for when that thing materialized and details are clear and not just a plan.

@ejona86 ejona86 merged commit fb71f67 into google:master Jan 28, 2025
11 checks passed
@ejona86 (Collaborator) commented Jan 28, 2025

Doh, sorry for the delay merging. I missed rougsig's very-speedy review 😄.

Thank you for the explanation; I've not gotten through it yet, but I do plan to. I definitely believe the configuration cache is easier than attribute-/variant-aware resolution; I've found that resolution to be something to avoid even before getting into its complexities, since there are games being played with it (like those Gradle contributed to Guava) and the diagnostics are weak or broken.

@Vampire (Contributor, Author) commented Jan 28, 2025

?!?
I would never disable Gradle Module Metadata.
It is a great thing to have and very powerful.
I've not yet seen a case where I would want to disable it, but if some lib has broken metadata just use a component metadata rule to fix the broken metadata ad-hoc.
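For example, a hedged sketch of such an ad-hoc rule (the module coordinates and the removed dependency are invented for illustration):

```groovy
dependencies {
    components {
        // Fix up the published metadata of a hypothetical broken library.
        withModule('com.example:broken-lib') { details ->
            details.allVariants {
                withDependencies {
                    // Drop a transitive dependency that should never have been published.
                    removeAll { it.group == 'com.example' && it.name == 'unwanted-transitive' }
                }
            }
        }
    }
}
```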

With GMM you can, e.g., have different dependencies depending on the Java version you use (for example when you need a dependency providing classes that were present in JDK 8 but are removed now and live in a separate dependency when using JDK 17), or you can resolve different variants for different use-cases explicitly, or you can have a proper separation of which dependencies you need downstream for compilation vs. runtime, to reduce compilation times and increase up-to-dateness and cache entry reusability, ....

But well, your project, your decision. :-)

@ejona86 (Collaborator) commented Jan 28, 2025

As a library GMM doesn't add any value and can only cause problems. Library consumers will often use Maven so there's no way for us to leverage its capabilities (not that I've seen a compelling case outside of Kotlin native, which is a different discussion). As a library we don't choose the version of Java, api/implementation already provides a good separation of dependencies, and I'm very doubtful it could offer up-to-dateness and cache reusability improvements (or rather, if it can, I don't want to know about it because it is going to be too complicated and fickle). We'd benefit most if our dependencies used Gradle so their implementation dependencies become runtime-only deps in the pom, instead of compile deps like they are now. But the metadata just means more testing and opportunities for something to bite us.
