A generic "generate" goal for version-controlled generated files #18235
cognifloyd
started this conversation in
Development
-
Native ability to define simple targets that write back to the non-sandboxed filesystem (with built-in up-to-date-ness checking) would be great! Some use-cases I've encountered:
Some small/bikeshed-y thoughts:
-
Just recording the workaround we use for this for now:
E.g. for a file
# path/to/BUILD
shell_command(
    name="generate-foo",
    command="echo abc > foo-generated.txt",
    tools=["echo"],
    output_files=["foo-generated.txt"],
)
files(name="foo", sources=["foo.txt"])
run_shell_command(
    name="write-foo",
    # different file name so that they can be compared in the test
    command="cp {chroot}/path/to/foo-generated.txt foo.txt",
    execution_dependencies=[":generate-foo"],
)
experimental_test_shell_command(
    # if this fails, run `pants run path/to:write-foo`
    name="check-foo-is-up-to-date",
    command="diff foo.txt foo-generated.txt",
    execution_dependencies=[":generate-foo", ":foo"],
)
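For anyone who finds it easier to read outside of BUILD syntax, the same three steps (generate, write back, check) can be sketched in plain Python. This is only an illustration of the logic, not how pants runs it: `generate_foo`, `is_up_to_date`, and `write_back` are hypothetical names, and the generator is a stand-in that mirrors the `echo abc` command above.

```python
from pathlib import Path


def generate_foo() -> bytes:
    # Hypothetical stand-in for the real generator; mirrors `echo abc`.
    return b"abc\n"


def is_up_to_date(checked_in: Path, expected: bytes) -> bool:
    """Return True when the checked-in file already matches the generated content."""
    return checked_in.is_file() and checked_in.read_bytes() == expected


def write_back(checked_in: Path, expected: bytes) -> bool:
    """Rewrite the checked-in file only when it is stale; return True if it changed."""
    if is_up_to_date(checked_in, expected):
        return False
    checked_in.write_bytes(expected)
    return True
```

The check corresponds to the `diff` test target and the write-back corresponds to the `cp` run target; keeping them as separate functions is what lets the test fail without touching the workspace.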
-
For a variety of reasons (many of which could probably be summarized as "legacy"), some generated files need to be committed to version control.
Since these files need to be in version control, it would be great if pants could make it easier to manage them. Here are the goals I have for whatever solution we use:
A survey of current pants goals
goal: export-codegen
At first blush, codegen seems ideal for making generated files. But codegen in pants is not designed for version-controlled files.
When pants generates code, it is considered an internal by-product, as described in the PR that introduced the export-codegen goal:
The export-codegen goal makes these generated files available in dist/ for those who want to:
So, we need a different form of codegen in pants for files that should be version controlled.
goals: lint, fmt, fix
Version-controlled files are the bread and butter of lint, fmt, and fix, which is the opposite of export-codegen, which is for files that are not version controlled.
Fixers and formatters also run during the lint goal so that lint fails if they need to fix or format a file.
Code generation of files that are version controlled is remarkably similar to fmt and fix. Files that need to be changed are advertised via the lint goal. Logically, fmt and fix are very different from codegen, but the process, the UX, and the underlying rules seem quite similar to me.
In the StackStorm project, I've shoved my (version-controlled) file generation under the fmt umbrella even though it's not really a formatter. Logically this generation is also not a fixer. But one of the benefits of doing this is that lint will fail if the generator needs to regenerate the file(s), and it will provide an error message that explains how to run ./pants fmt ... to resolve the lint error.
I've heard of others that create a "test" that fails if the generated file(s) were not properly regenerated. I really like using lint for this, because it takes advantage of the fine-grained caching and reuses the cache for both lint and fmt/fix.
goal: generate-lockfiles
Another goal that is very similar to the file generation feature I'm looking for is the generate-lockfiles goal. The warnings/errors saying to regenerate can be triggered in many places where rules need to use the lockfiles. So, there is no need to integrate that warning in a lint backend/rule. The goal also provides a fairly ergonomic interface for regenerating those.
generate-lockfiles looks up the resolves, translates them into lockfile paths, and then generates the lockfile at that path.
Proposal for a new "generate" goal
If there were a generic "generate" goal, it could go in the opposite direction of generate-lockfiles: given the path of a lockfile, look up the resolve associated with that lockfile, and then generate the lockfile at that path.
Similarly, that "generate" goal could take a path to another generated file, look up the owning target for it, and if that target is explicitly for generated files, run the plugin/subsystem/tool that encompasses the codegen logic for that target.
This goal could be only for non-lockfile generated files. If we did include lockfile generation in this goal, then the owning target would probably be a synthetic target synthesized from the config in pants.toml -- so it is explicitly defined (even though not defined in a BUILD file).
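To make the two lookups concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `RESOLVES` dict only imitates the name-to-path shape of the resolves config in pants.toml, the `GENERATORS` dict and its target addresses are made up, and `fnmatch` is a simplification of real source globs (its `*` also matches `/`). None of this is existing pants API.

```python
from fnmatch import fnmatch
from typing import Dict, List, Optional

# Hypothetical data: resolve name -> lockfile path.
RESOLVES: Dict[str, str] = {
    "python-default": "3rdparty/python/default.lock",
    "tools": "3rdparty/python/tools.lock",
}

# Hypothetical data: address of a file-generating target -> globs it owns.
GENERATORS: Dict[str, List[str]] = {
    "conf:sample": ["conf/*.sample"],
    "docs:schemas": ["docs/schemas/*.json"],
}


def resolve_for_lockfile(path: str, resolves: Dict[str, str] = RESOLVES) -> Optional[str]:
    """Reverse of what generate-lockfiles does: lockfile path -> resolve name."""
    by_path = {lockfile: name for name, lockfile in resolves.items()}
    return by_path.get(path)


def owning_generator(path: str, generators: Dict[str, List[str]] = GENERATORS) -> Optional[str]:
    """Return the single generating target whose globs match the path, if any."""
    owners = sorted(
        address
        for address, globs in generators.items()
        if any(fnmatch(path, glob) for glob in globs)
    )
    if len(owners) > 1:
        raise ValueError(f"{path} is claimed by more than one generator: {owners}")
    return owners[0] if owners else None
```

With this shape, lockfiles could either stay on their own lookup (as here) or be folded into the generator table through synthetic targets; the sketch does not take a position on that.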
This goal should be modeled on fix/fmt (not modeled on export-codegen). This would extend the lint request/result classes (like fix/fmt do) so that file generation happens in a sandbox during lint, failing lint if something should be regenerated. When lint fails here, the error message would prompt the user to run ./pants generate path/to/generated/file/or/dir.
For a plugin to provide this generate rule, it would either have to add a custom target, or add a field to an existing target to allow flagging it as owning generated files. This would be a codified assumption built into the generate goal: all generated files must be owned by exactly 1 file-generating target. If unowned, they will not be materialized in the workspace. If owned by more than one file-generating target, raise an error. Generating a file that matches a file glob for that target counts for this.
Then, if any other linters can work with the generated files, the standard skip_tool fields provide an escape hatch if the generated code does not need to meet the same standards as other files. Or, for StackStorm, I can add an extra linter specific to the generated file that checks more than just whether or not it needs to be regenerated.
Goal: run
A lot of work has gone into runnable targets like adhoc_tool and shell_command. If combined with the generate goal, those could allow repos to use light-weight run targets and file generator targets instead of writing medium- or heavy-weight pants plugins.
So, a runnable target could be tied to the file-generating target such that the adhoc_tool or pex_binary is used to actually do the generation when a user does ./pants generate path/to/generated/file.
For the run goal, the UX focuses on the generator script/target. For this "generate" goal, the UX focuses on the files to generate. So they are complementary.
goal: go-generate
#16909
This seems very similar and could perhaps be rolled into the new "generate" goal. go-generate references the go modules that define the generation. I'm looking at something that targets the generated files. But, often go:generate commands work within the same module where they are defined, so maybe it is effectively the same thing.
But, this is something that would not run via lint. So, that would probably be a go backend option to skip running for lint if the goals were combined.
discussion
Actually naming the new "generate" goal is a different discussion. When we discuss that, we'll need to make sure it can be distinguished from the codegen that already exists in pants.
So, would a generic generate goal be helpful for anyone else? Does a generate goal that follows a process similar to fix/fmt make sense for other projects?