Refactor: template parser answer list support #2211

chrismclarke · 2024-02-22T18:43:34Z

PR Checklist

PR title descriptive (can be used in release notes)

TODO

If merging after Chore(core)!: Dependency major updates #2157 will need to change sentry import accordingly

Description

Update parser to parse answer_lists (previously would happen at runtime)
Add spec tests for template parser
Improve methods for row name auto-generation
Update parser cache version (will force reprocess all sheets next sync)

Author Notes

Been a while since taking a closer look at the template parser. A few things noted while working on the PR:

Notes

answer_lists need to be defined as local variables with name including _list in them somewhere. Not sure if this was previously documented anywhere, I haven't changed the handling just noted that it is the case. See more info about special row name handling in questions section below
Following the change all content repos should see git diffs as answer_lists populated fully (not just string partial representation)
One limitation of parsing answer lists in the compiler is harder to determine whether a named list is actually a list or a reference to something else. E.g. {name: "my_answer_list", value: "@local.some_other_answer_list"}. In order to handle this case I've added a check explicitly for ; when specifying template lists. That may break content if a sheet defined a list with only one value and forgot to include, e.g. answer_list local variable value defined as:

"name: value_1 | text: my text"

as opposed to

 "name: value_1 | text: my_text; name: value_2 | text: my text 2"

There is currently a default to handle rows without a name which is just setting the name as the row type. This vaguely works if there is just one text or image type row in a template and nothing needs to modify or refer back, however felt quite fragile in general. I've added a minor update so that by default we now also add a suffix to the name derived from the row number being processed, e.g. if the sheet has unnamed text on row 8 the generated name should be text_8. Is this something useful, or would it be better to remove entirely and replace with some quality-control check to ensure all rows have an author-defined name.
With the changes above there may be small content diffs when syncing repos if any do contain non-named rows

Questions

It appears that parameter lists only ever support simple key-value pairs during parsing (e.g. answer_list: @local.some_list), with additional support for key-only properties (e.g. input_allowed would be implied as input_allowed: 'true'). Do we have any runtime cases of more complex use cases that we might want to reconsider evaluating in the parser? E.g. parameter lists where the value isn't just a simple string but more complex object?
There seems support template_group row type, is this something we still use? (don't have strong memory of it)
We're currently passing the exclude_from_translation property explicitly with every row entry. Given that this info isn't ever used at runtime it might be worth considering exporting this separately (e.g. a single json file of excluded translations). Possibly a small follow-up issue (might be best to coordinate with translation workflow)
Templates have special handling for any variables named _list (convert to array) or _collection (convert to json key-value pairs). This is on top of specific handling for columns ending _list (e.g. condition_list), _collection, _date, or specifically action_list.
Is this something documented somewhere? Do we know if things like local name and/or column name _collection handling is still used?

Dev Notes

I've left the runtime answer list item parse method in the code for now for any cases where the parsing may not have been done correctly in advance (I don't think it should happen but probably better to be safe for now.. maybe if read in from a global or cell within a data_list (as opposed to full data list row)). I've also added an explicit call to sentry to detect if the code is executed to track whether we can safely remove in the future

I've done my best to add spec tests to cover all the cases covered explicitly within the template parser methods in their current state. I'm not sure if it might make sense to further try to refine/improve as follow-up (e.g. remove template_group handling which I think must be deprecated by now). Spec tests can be run via:

yarn workspace scripts test

The code also includes a placeholder method to run specific QC checks on templates, however I stopped short of adding as the PR was already growing quickly. Might be worth reviewing in the future possible checks worth having (e.g. I had planned to include one to ensure all parameter lists that included answer lists pointed at local variables with correct _list in name)

Git Issues

Closes #2210

Screenshots/Videos

Tests added - see full test details in spec file

…ub.com/IDEMSInternational/parenting-app-ui into refactor/answer-list-parser

…ing-app-ui into refactor/answer-list-parser

esmeetewinkel

answer_lists need to be defined as local variables with name including _list in them somewhere.

It seems that the local variable needs to end in _list, containing it as a substring is not enough: A variable answer_list_1 is not parsed differently, but answer_1_list is. (template question of plh_kids_tz deployment)

There is currently a default to handle rows without a name which is just setting the name as the row type. This vaguely works if there is just one text or image type row in a template and nothing needs to modify or refer back, however felt quite fragile in general. I've added a minor update so that by default we now also add a suffix to the name derived from the row number being processed, e.g. if the sheet has unnamed text on row 8 the generated name should be text_8. Is this something useful, or would it be better to remove entirely and replace with some quality-control check to ensure all rows have an author-defined name.

I like this suggestion in general, makes sense. I did find the following issues with the current code:

This also applies to data_items. I'm understanding that a begin_data_items row "expects" a name column value (populating it with a unique default when missing). This is a bit confusing, since a begin_items row breaks when there is anything in the name column. (template comp_data_items)
When rows sit nested in a display group the proposed default row names are numbered within the scope of the display group, which doesn't make sense to me. When referring to these rows in a parent template the display group is invisible (i.e. as opposed to begin_nested_template there is no begin_nested_display_group) so we need the default row names to be unique within the scope of the template, I'd say. (template feature_display_group)

I don't see file changes like this in any regular template with a begin_items row, but it does show up for a generated template that has a begin_items row (sheet example_generator_temp generating example_generator_output_w_1)

There seems support template_group row type, is this something we still use? (don't have strong memory of it)

Doesn't ring a bell to me... So definitely we don't use it anymore, not sure we ever did.

chrismclarke · 2024-03-15T15:37:18Z

It seems that the local variable needs to end in _list, containing it as a substring is not enough: A variable answer_list_1 is not parsed differently, but answer_1_list is.

@esmeetewinkel
Thanks for looking through all these cases

You're right, I can see the default parser only converts special fields (here meaning sheet columns) depending on their name ending

  if (typeof value === "string") {
          if (field.endsWith("_list")) {
            this.row[field] = parseAppDataListString(value);
          }
          if (field.endsWith("_collection")) {
            this.row[field] = parseAppDataCollectionString(this.row[field]);
          }
          if (field.endsWith("action_list")) {
            this.row[field] = this.row[field]
              .map((actionString) => parseAppDataActionString(actionString))
              .filter((action) => action !== null);
          }
        }
        // convert google/excel number dates to dates (https://stackoverflow.com/questions/16229494/converting-excel-date-serial-number-to-date-using-javascript)
        if (field.endsWith("_date")) {
          if (typeof this.row[field] === "number") {
            this.row[field] = parseAppDateValue(this.row[field]);
          }
        }

It is then only at a later stage that the template parser handles based on row names, and supports the text included anywhere in the name

 if (row.name?.includes("_list")) {
        row.value = this.parseTemplateList(row.value);
      }
      if (row.name?.includes("_collection") && row.value && typeof row.value === "string") {
        row.value = parseAppDataCollectionString(row.value);
      }

So that means right now for any data_list the column must end in _list to be processed as expected, and for any template that defines inline then the row must contain _list somewhere in the name.

Do you think this distinction makes sense/is sensible, or better to also allow data_lists to convert if the column has _list anywhere in the name? I don't think it was considered to carefully when first implemented so open to suggestion

chrismclarke · 2024-03-15T16:03:42Z

This also applies to data_items. I'm understanding that a begin_data_items row "expects" a name column value (populating it with a unique default when missing). This is a bit confusing, since a begin_items row breaks when there is anything in the name column.

I'm not 100% sure if it breaks or not, because I think there's still fixes applied in #2220 that would need merge to work properly.
So it might be worth retesting after - I actually think assigning a name to data items should be a requirement because otherwise it would be possible to have 2 sets of data items with children that have duplicate name references, but can see post #2220

I did pick up one extra issue while testing on the debug repo though related to navigation-bar component which I've pushed a fix for

chrismclarke · 2024-03-15T18:28:27Z

2. When rows sit nested in a display group the proposed default row names are numbered within the scope of the display group, which doesn't make sense to me. When referring to these rows in a parent template the display group is invisible (i.e. as opposed to begin_nested_template there is no begin_nested_display_group) so we need the default row names to be unique within the scope of the template, I'd say.

Yeah it's a bit of a tricky one. I (think) I can understand what you mean - when referring to the rows from within the same template they can be identified independent of the display group container (this is the case and what you mean, right??) - and so makes sense enforcing uniqueness across the template . The challenge here is that when it comes to the rendering process they are still rendered as child rows of the display group container (which handles the row layout logic).

So I think the anomaly here is being able to refer to the rows directly, say nested_text_1, instead of as my_display_group.nested_text_1. I think it probably wouldn't be very easy to adapt the case for display groups without diving deeper into the overall templating system, so might just be a limitation/bug/feature to be aware of for now...

I'm not sure whether it would be worth the time or not to create an example sheet with the sole purpose of reviewing generated row names or not... It's tricky added all these cases to the test specs as depends somewhat on different levels of processing (e.g. items and data_items apply certain manipulations at runtime so they can iterate over dynamic data while display_groups are state and managed more at compile time)

esmeetewinkel · 2024-03-21T16:20:36Z

So that means right now for any data_list the column must end in _list to be processed as expected, and for any template that defines inline then the row must contain _list somewhere in the name.

Do you think this distinction makes sense/is sensible, or better to also allow data_lists to convert if the column has _list anywhere in the name? I don't think it was considered to carefully when first implemented so open to suggestion

Okay, thanks for checking and clarifying. I can see reasons for either convention (it could be handy for list_1, list_2, etc still to be recognised as lists, but it would be annoying if listenor specialist would be recognised as lists). I'd definitely be in favour of handling the same way for sheet columns and template rows.

chrismclarke · 2024-04-12T19:31:07Z

One issue I encountered: it seems that bumping the cacheVersion of the flowParser is not enough to prompt all sheets to be re-parsed on sync, only the handful of sheets that had new author edits were re-parsed for me. I tried bumping the number myself, which didn't seem to have an effect. What did work was bumping the cacheVersion number as specified in the base processor. After that, syncing makes changes to over a hundred parsed sheets, as expected. I believe this is because the cacheVersion property is used in the BaseProcessor constructor, so the derived class's property does not override it, see this discussion.

Thanks for catching, I've pushed 0089892 which should now force the inheritance correctly by passing to the constructor. Let me know if now working as expected.

Once happy this should be good to merge, I accidentally re-requested review from @esmeetewinkel however I don't think there should be any additional changes to look at

jfmcquade

Thanks for catching, I've pushed 0089892 which should now force the inheritance correctly by passing to the constructor. Let me know if now working as expected.

Can confirm that the cache versioning seems to function as expected, so should be good to merge.

chrismclarke added 8 commits February 21, 2024 09:15

refactor: parse answer_lists in template parser

a12fd5b

chore: remove legacy parsAppDataString

8426652

chore: update template answer list extract

e4e57bc

Merge branch 'fix/combo-box-radio-group-empty-fields' of https://gith…

98a4f0e

…ub.com/IDEMSInternational/parenting-app-ui into refactor/answer-list-parser

chore: wip spec tests

974c65f

feat: template row name generation

fa76774

Merge branch 'master' of https://github.com/IDEMSInternational/parent…

dca4f6c

…ing-app-ui into refactor/answer-list-parser

chore: deprecate parseAnswerListItem

7fdf475

chrismclarke had a problem deploying to debug February 22, 2024 18:43 — with GitHub Actions Failure

chore: code tidying

a87acd3

chrismclarke had a problem deploying to debug February 22, 2024 18:57 — with GitHub Actions Failure

chore: code tidying

b6c9dc8

chrismclarke temporarily deployed to debug February 22, 2024 19:43 — with GitHub Actions Inactive

chrismclarke requested review from jfmcquade and esmeetewinkel February 22, 2024 19:44

chrismclarke marked this pull request as ready for review February 22, 2024 19:44

chrismclarke changed the title ~~Refactor/answer list parser~~ Refactor: template parser answer list support Feb 22, 2024

This was referenced Feb 23, 2024

Chore(core)!: Dependency major updates #2157

Merged

feat: set_item action: optionally target an index or id #2215

Merged

esmeetewinkel reviewed Mar 15, 2024

View reviewed changes

Merge branch 'master' into refactor/answer-list-parser

15dbe48

chrismclarke had a problem deploying to debug March 15, 2024 15:22 — with GitHub Actions Failure

chore: deprecate and move nav bar value list item parse

133f3b6

chrismclarke had a problem deploying to debug March 15, 2024 16:02 — with GitHub Actions Failure

chore: code tidying

46413a7

chrismclarke temporarily deployed to debug March 15, 2024 16:15 — with GitHub Actions Inactive

chrismclarke temporarily deployed to debug April 12, 2024 18:46 — with GitHub Actions Inactive

chrismclarke temporarily deployed to debug April 12, 2024 18:49 — with GitHub Actions Inactive

fix: cache versions

0089892

chrismclarke temporarily deployed to debug April 12, 2024 19:27 — with GitHub Actions Inactive

chrismclarke temporarily deployed to debug April 12, 2024 19:30 — with GitHub Actions Inactive

chrismclarke requested review from jfmcquade and esmeetewinkel April 12, 2024 19:31

chrismclarke mentioned this pull request Apr 12, 2024

Refactor: extract dynamic fields #2294

Merged

3 tasks

Merge branch 'master' into refactor/answer-list-parser

8752f73

esmeetewinkel temporarily deployed to debug April 18, 2024 13:19 — with GitHub Actions Inactive

esmeetewinkel temporarily deployed to debug April 18, 2024 13:20 — with GitHub Actions Inactive

esmeetewinkel removed their request for review April 18, 2024 13:21

esmeetewinkel temporarily deployed to debug April 18, 2024 13:23 — with GitHub Actions Inactive

jfmcquade approved these changes Apr 18, 2024

View reviewed changes

jfmcquade merged commit d208bed into master Apr 18, 2024
10 checks passed

jfmcquade deleted the refactor/answer-list-parser branch April 18, 2024 13:50

This was referenced Apr 18, 2024

content: 1.2.21 IDEMSInternational/app-debug-content#63

Closed

content: 1.2.24 IDEMSInternational/app-debug-content#64

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor: template parser answer list support #2211

Refactor: template parser answer list support #2211

chrismclarke commented Feb 22, 2024 •

edited

Loading

esmeetewinkel left a comment •

edited

Loading

chrismclarke commented Mar 15, 2024 •

edited

Loading

chrismclarke commented Mar 15, 2024

chrismclarke commented Mar 15, 2024 •

edited

Loading

esmeetewinkel commented Mar 21, 2024

chrismclarke commented Apr 12, 2024 •

edited

Loading

jfmcquade left a comment

Refactor: template parser answer list support #2211

Refactor: template parser answer list support #2211

Conversation

chrismclarke commented Feb 22, 2024 • edited Loading

TODO

Description

Author Notes

Dev Notes

Git Issues

Screenshots/Videos

esmeetewinkel left a comment • edited Loading

Choose a reason for hiding this comment

chrismclarke commented Mar 15, 2024 • edited Loading

chrismclarke commented Mar 15, 2024

chrismclarke commented Mar 15, 2024 • edited Loading

esmeetewinkel commented Mar 21, 2024

chrismclarke commented Apr 12, 2024 • edited Loading

jfmcquade left a comment

Choose a reason for hiding this comment

chrismclarke commented Feb 22, 2024 •

edited

Loading

esmeetewinkel left a comment •

edited

Loading

chrismclarke commented Mar 15, 2024 •

edited

Loading

chrismclarke commented Mar 15, 2024 •

edited

Loading

chrismclarke commented Apr 12, 2024 •

edited

Loading