DNM: Retool how validators register and are called #89

thockin · 2025-01-04T17:25:51Z

DNM (yet): this builds on the function-style tag args PR.

Looking at how to not fail silently when tags are used wrong. I started adding more lint rules for things like "you can't use +optional on a type". It struck me that this is something that should be handled more first-class, rather than ex post facto. We will inevitably forget to add the lint rules for some tags.

We have the tag docs struct which includes a "contexts" list, which indicates when a tag is usable. I tried automating that check as a lint rule and it's not ideal as-is. It also highlighted that most of our tag docs are just WRONG (cargo culted) and that the context definitions we have are not really great.

Examples:

- optional/required should not be on types
- eachVal/eachKey claim they can be used on list values (they can't today)
- listMapKey claims it can be used on list values (I don't think that makes sense?)
- subfield claims it can be used on list values (I don't think it can)
- unionDiscriminator/unionMember claim they can be used on list values (I don't think that makes sense?)

Lastly, each ExtractValidations() call doesn't have enough information to DTRT even if we fixed the above.

This PR started with expanding the contexts to distinguish struct-fields from list/map keys/values. It evolved into a different way of approaching tags.

Rather than each plugin's ExtractValidations() being responsible for checking that it is used in an appropriate context (and getting new args to indicate the extra information), I tried to do something a little more "scaffolding" centric. The init sequence now registers validators which include the tag string and contexts (basically the docs struct, but more fine-grained). Generic code then looks up the tags, ensures correct context, and calls a method to produce the specific FunctionGen objects.

Basically, centralize the tag-processing. Now generation will fail with errors like:

failed to generate validations: k8s.io/code-generator/cmd/validation-gen/output_tests/optional.T1: tag "k8s:optional" cannot be specified on type definitions

or

failed to generate validations: field k8s.io/code-generator/cmd/validation-gen/output_tests/options.T1.S2: tag "k8s:ifOptionDisabled": takes exactly 1 argument

...but the optional plugin didn't have to do the work. Some plugins still need to do work, like if they only apply to strings, but that seems OK to me.
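
A minimal sketch of the centralized check, to make the shape concrete. Every name here (`Scope`, `TagValidator`, `Registry`, `Extract`, `newRegistry`) is made up for illustration and is not the API in this PR; validations are plain strings standing in for FunctionGen objects.

```go
package main

import "fmt"

// Scope says where a tag was found. Illustrative names, not the PR's.
type Scope string

const (
	ScopeType  Scope = "type definitions"
	ScopeField Scope = "struct fields"
)

// TagValidator is a hypothetical descriptor: the tag name, the scopes where
// the tag is legal, and a callback that produces validations for a legal use.
type TagValidator struct {
	Tag    string
	Scopes []Scope
	Gen    func(path, value string) ([]string, error)
}

// Registry owns tag lookup and the scope check, so plugins do not have to.
type Registry struct {
	byTag map[string]TagValidator
}

// Register records a descriptor under its tag name.
func (r *Registry) Register(tv TagValidator) {
	if r.byTag == nil {
		r.byTag = map[string]TagValidator{}
	}
	r.byTag[tv.Tag] = tv
}

// Extract checks the scope centrally and only then calls the plugin.
func (r *Registry) Extract(scope Scope, path, tag, value string) ([]string, error) {
	tv, ok := r.byTag[tag]
	if !ok {
		return nil, fmt.Errorf("%s: unknown tag %q", path, tag)
	}
	for _, s := range tv.Scopes {
		if s == scope {
			return tv.Gen(path, value)
		}
	}
	return nil, fmt.Errorf("%s: tag %q cannot be specified on %s", path, tag, scope)
}

// newRegistry registers a field-only "k8s:optional" as an example.
func newRegistry() *Registry {
	r := &Registry{}
	r.Register(TagValidator{
		Tag:    "k8s:optional",
		Scopes: []Scope{ScopeField},
		Gen: func(path, _ string) ([]string, error) {
			return []string{"Optional(" + path + ")"}, nil
		},
	})
	return r
}

func main() {
	r := newRegistry()
	// The plugin callback is never reached for a misplaced tag.
	_, err := r.Extract(ScopeType, "example.T1", "k8s:optional", "")
	fmt.Println(err)
}
```

The plugin only declares where it is valid; the error for a misplaced tag comes from the shared code path.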

This PR includes all my commits, even when later commits undo or change part of it, so readers can see the thought process. In particular, this centralized tag handling is insufficient for things like union which needs to run on the TYPE, but the tags are on FIELDs.

Now there are "tag validators" and "type validators" (which run AFTER child fields). Union has tag handlers that are used to validate context (e.g. "+union only applies to struct fields") and accumulate info, and a type handler which uses the accumulated info.
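
Roughly, the split between tag handlers and a type handler could look like the sketch below. This is an assumption-laden illustration: `unionPlugin`, `OnMemberTag`, `OnDiscriminatorTag` and `OnType` are hypothetical names, and the emitted validation is just a string.

```go
package main

import (
	"fmt"
	"sort"
)

// unionInfo is the state shared between the tag handlers and the type handler.
type unionInfo struct {
	discriminator string
	members       []string
}

// unionPlugin is hypothetical; it is not the plugin from this PR.
type unionPlugin struct {
	byType map[string]*unionInfo
}

func newUnionPlugin() *unionPlugin {
	return &unionPlugin{byType: map[string]*unionInfo{}}
}

// info returns the accumulated state for a type, creating it on first use.
func (p *unionPlugin) info(typeName string) *unionInfo {
	u, ok := p.byType[typeName]
	if !ok {
		u = &unionInfo{}
		p.byType[typeName] = u
	}
	return u
}

// OnMemberTag is a tag handler: it records the field and emits nothing.
func (p *unionPlugin) OnMemberTag(typeName, field string) {
	u := p.info(typeName)
	u.members = append(u.members, field)
}

// OnDiscriminatorTag is a tag handler: it records the discriminator field.
func (p *unionPlugin) OnDiscriminatorTag(typeName, field string) error {
	u := p.info(typeName)
	if u.discriminator != "" {
		return fmt.Errorf("type %s: more than one union discriminator", typeName)
	}
	u.discriminator = field
	return nil
}

// OnType is the type handler. It runs after all child fields were visited
// and turns the accumulated info into the final validation.
func (p *unionPlugin) OnType(typeName string) []string {
	u, ok := p.byType[typeName]
	if !ok || len(u.members) == 0 {
		return nil // most types carry no union tags
	}
	members := append([]string(nil), u.members...)
	sort.Strings(members) // keep generated output deterministic
	return []string{fmt.Sprintf("Union(discriminator=%q, members=%v)", u.discriminator, members)}
}

func main() {
	p := newUnionPlugin()
	if err := p.OnDiscriminatorTag("T", "Kind"); err != nil {
		panic(err)
	}
	p.OnMemberTag("T", "B")
	p.OnMemberTag("T", "A")
	fmt.Println(p.OnType("T"))
}
```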

It may be possible to convert things like eachVal to "regular" validators.

I wanted to move registry to a different package, but to avoid a circular dep, it has to become 3 packages, and it didn't seem worthwhile.

thockin · 2025-01-06T01:10:16Z

For giggles I added some WIP commits to make +eachVal a "regular" tag. To do that, I had to do listMapKey and listType, too.

This probably could have been simpler, and list-map could be treated "specially" (with info passed to validators), but I wanted to see if I could do it functional-style.

It sorta works. There are some issues around pointerness, still, and it's not as understandable as I'd like, but I think you can get the idea. Basically there's a family of list-oriented validators which work together to collect info about keys and then extract old values when we can.

The big remaining problem is that tags are discovered in sort-order, which means that +eachVal is run before +listMapKey. I see two ways to solve it:

1. internally -- collect a list of funcs to call from the new FieldValidator, which runs after all tags.
2. let tags declare a sort order for discovery (they already do for execution). E.g. "schematic" vs "content" (words TBD).
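
The second option might be sketched like this, assuming each tag declares a discovery phase. The `Phase` type, the phase names, and `discoveryOrder` are placeholders (the words are TBD, as noted), not code from the branch.

```go
package main

import (
	"fmt"
	"sort"
)

// Phase is a hypothetical discovery phase declared by each tag.
type Phase int

const (
	PhaseSchematic Phase = iota // tags that describe shape, e.g. listMapKey
	PhaseContent                // tags that validate values, e.g. eachVal
)

// tagUse is one tag found on a field, plus the phase its validator declared.
type tagUse struct {
	Name  string
	Phase Phase
}

// discoveryOrder sorts by phase first and by name second, so schematic tags
// are always processed before content tags regardless of their names.
func discoveryOrder(tags []tagUse) []tagUse {
	out := append([]tagUse(nil), tags...)
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].Phase != out[j].Phase {
			return out[i].Phase < out[j].Phase
		}
		return out[i].Name < out[j].Name
	})
	return out
}

func main() {
	// Plain name order would visit eachVal before listMapKey.
	tags := []tagUse{
		{Name: "k8s:eachVal", Phase: PhaseContent},
		{Name: "k8s:listMapKey", Phase: PhaseSchematic},
		{Name: "k8s:listType", Phase: PhaseSchematic},
	}
	for _, t := range discoveryOrder(tags) {
		fmt.Println(t.Name)
	}
}
```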

If we like the approach, we can see how subfield would work, too.

thockin · 2025-01-06T17:20:55Z

For more giggles, I made +k8s:subfield(ls)=+k8s:eachVal2=+k8s:format=ip-sloppy work - that's running on each value of a subfield of list type.

I can't yet reverse that, until I convert subfield to a regular tag.
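
For readers unfamiliar with the chained form, here is a rough sketch of how such a tag breaks down into nested tags. It is a toy parser written for this illustration, not the gengo parser: `Tag`, `ParseTag` and `Chain` are hypothetical, and it only understands a single Go-identifier argument.

```go
package main

import (
	"fmt"
	"go/token"
	"strings"
)

// Tag is one parsed tag: name, optional single argument, and raw value.
type Tag struct {
	Name, Arg, Value string
}

// ParseTag splits "+name(arg)=value". The argument and the value are optional.
func ParseTag(s string) (Tag, error) {
	s = strings.TrimPrefix(s, "+")
	i := strings.IndexAny(s, "(=")
	if i < 0 {
		i = len(s)
	}
	t := Tag{Name: s[:i]}
	if t.Name == "" {
		return Tag{}, fmt.Errorf("empty tag name in %q", s)
	}
	rest := s[i:]
	if strings.HasPrefix(rest, "(") {
		end := strings.IndexByte(rest, ')')
		if end < 0 {
			return Tag{}, fmt.Errorf("tag %q: missing ')'", t.Name)
		}
		t.Arg = rest[1:end]
		if !token.IsIdentifier(t.Arg) {
			return Tag{}, fmt.Errorf("tag %q: takes exactly 1 identifier argument", t.Name)
		}
		rest = rest[end+1:]
	}
	if rest != "" {
		if rest[0] != '=' {
			return Tag{}, fmt.Errorf("tag %q: unexpected %q", t.Name, rest)
		}
		t.Value = rest[1:]
	}
	return t, nil
}

// Chain follows values that are themselves tags (they start with "+").
func Chain(s string) ([]Tag, error) {
	var out []Tag
	for {
		t, err := ParseTag(s)
		if err != nil {
			return nil, err
		}
		out = append(out, t)
		if !strings.HasPrefix(t.Value, "+") {
			return out, nil
		}
		s = t.Value
	}
}

func main() {
	tags, err := Chain("+k8s:subfield(ls)=+k8s:eachVal2=+k8s:format=ip-sloppy")
	if err != nil {
		panic(err)
	}
	for _, t := range tags {
		fmt.Printf("name=%s arg=%q value=%q\n", t.Name, t.Arg, t.Value)
	}
}
```

Each outer tag receives the rest of the chain as its value, which is why the order of nesting matters.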

Instead of registering a factory-function and expecting the resulting type to extract tags itself, we now register a "tag descriptor" which has enough information to centrally check some contextual requirements (e.g. where can this tag be used) and then call the plugins to process the extracted value. This avoids all the plugins having to do validation for context, though they may still have to do some handling, e.g. if they care about lists vs scalars.

This commit leaves the tree in a working state. It only does the `optional` tag, and subsequent commits will do more. At the end, the legacy code will be removed.

TypeValidators will be called on every type and should be comparatively rare. This is needed to make the union validation work. Specifically:

1) TagValidators are triggered to accumulate details about which fields are in which unions, as well as about discriminators. They emit no validations.
2) TypeValidator emits the final validation.

It's a bit more spread out than before but it allows the centralization of tag handling. More renames will follow, to simplify the UX.

It makes more sense to sort here, rather than later (when emitting). Consider a type or field with the following comments:

```
// +k8s:validateFalse="111"
// +k8s:validateFalse="222"
// +k8s:ifOptionEnabled(Foo)=+k8s:validateFalse="333"
```

Tag extraction will retain the relative order between 111 and 222, but 333 is extracted as tag "k8s:ifOptionEnabled". We iterate that map (in a random order). When it reaches the emit stage, the "ifOptionEnabled" part is gone, and we have 3 FunctionGen objects, all with tag "k8s:validateFalse". They are in a non-deterministic order because of that map iteration. If we sort them at that point, we don't have enough information to do something smart, unless we look at the args, which are sort of opaque to us.

Sorting earlier means we can sort "k8s:ifOptionEnabled" against "k8s:validateFalse". All of the records within each of those are relatively ordered, so the result here would be to put "ifOptionEnabled" before "validateFalse" (lexicographical is better than random).
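
The early sort can be pictured with the sketch below, assuming extracted tags arrive as a map from tag name to payloads in source order. The `extracted` type and `flatten` function are invented for the example; payloads are plain strings rather than FunctionGen objects.

```go
package main

import (
	"fmt"
	"sort"
)

// extracted pairs a top-level tag name with one payload found under it.
type extracted struct {
	Tag, Payload string
}

// flatten visits the map in sorted key order instead of Go's randomized map
// order. Payloads under one tag keep the order in which they were extracted.
func flatten(byTag map[string][]string) []extracted {
	tags := make([]string, 0, len(byTag))
	for t := range byTag {
		tags = append(tags, t)
	}
	sort.Strings(tags)

	var out []extracted
	for _, t := range tags {
		for _, p := range byTag[t] {
			out = append(out, extracted{Tag: t, Payload: p})
		}
	}
	return out
}

func main() {
	byTag := map[string][]string{
		"k8s:validateFalse":   {"111", "222"},
		"k8s:ifOptionEnabled": {"333"},
	}
	// The same order on every run: ifOptionEnabled first, then 111, then 222.
	for _, e := range flatten(byTag) {
		fmt.Println(e.Tag, e.Payload)
	}
}
```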

thockin assigned jpbetz, aaron-prindle and yongruilin Jan 4, 2025

thockin changed the title ~~Retool how validators register and are called~~ DNM: Retool how validators register and are called Jan 4, 2025

thockin force-pushed the validation-gen_tag-descriptors branch from 5165747 to ca520e4 Compare January 6, 2025 01:02

thockin force-pushed the validation-gen_tag-descriptors branch from ca520e4 to 4e424a7 Compare January 6, 2025 17:19

thockin added 22 commits January 6, 2025 16:18

Re-vendor gengo for functional tag parsing

89ec024

Use function-style tags for options

30c8385

This defines a parser for tags which takes a single Go-style identifier argument (but leaves room for better parsing).

Remove stale comments about "k8s:foo" names

a82a29f

Fix typo: schema -> scheme

4780d9b

Remove unused parameter

628aaff

Convert validateTrue/False/Error to tag descriptor

34dcb90

This required handling embedded tags, since those are used all over the place. This commit also simplifies TagContext.

Get rid of old TagContext

a16676b

Fix docs to use shared info

4267375

Also, since tag descriptors cover one tag, the new Docs() method only needs to return one value.

Rename TagContext2 -> TagContext

5395c46

Clean up documentation around the tag registry

9d81f01

Also split tag-descriptor code to a new file

Convert required and forbidden

efd3749

Convert format, maxLength, maxItems

6d90761

We don't need the TagDescriptor type assertion

bb66088

Convert enum

b98ff9c

Cleanup: always use validations.Add

a853163

This was exposed by a bug where I ended up clobbering rather than appending. No reason I can see not to do it this way always.

Rename TagRegistry -> ValidatorRegistry

07a2b3a

Rename TagDescriptor -> TagValidator

cde0cd5

Rename TagContext -> Context

c3e6520

Rename TagScope -> Scope

7e5e1e3

Rename TagDoc.Contexts -> TagDoc.Scopes

f0aa791

thockin added 11 commits January 6, 2025 16:26

Merge tags.go and validators.go

fab8ef5

I thought splitting them was cleaner, but it's not.

Rename tag-validator types to all be ...Validator

1afc29c

It's more verbose but clearer.

Make Init() take a config type like before

9a4482a

Convert ifOption{En,Dis}abled tag

336d9e5

Remove older validator code, now dead

f1dfa45

Rename things to older, simpler names

d101c3a

Hide registry impl behind an interface

5c30bd2

Clean up tag-docs, add args and usage

4eed632

WIP: add field-validators...

32afb21

...to enable eachKey and listMap tags, hopefully

WIP: make eachVal a regular tag

2d63642

This requires doing listMap at the same time. It seems to build for a few test cases, but lots of debris to fix, still.

thockin force-pushed the validation-gen_tag-descriptors branch from 4e424a7 to 2d63642 Compare January 7, 2025 00:27
