feat: track the answers to allow_list questions in analytics_logs #2487

Mike-Heneghan · 2023-11-27T17:11:31Z

What:

Passport variable tracking:
- Begin tracking the fn or val of a node
- Store this data against a new column node_fn on the analytics_logs
Answers to allow_list question tracking:
- Defined a new array of questions which are part of the allow_list (i.e. useful for analytics but not sensitive and/or PII)
- On component transition check whether the node fn or val is in the allow list and if so update the log with the user's answer
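A minimal sketch of the allow-list check described above, assuming a simplified node shape with optional `fn` and `val` fields. The names `ALLOW_LIST`, `FlowNode`, `getNodeFn` and `isAllowListed` are hypothetical and not taken from the PR; `proposal.projectType` is used as the only entry simply because it is mentioned in this thread.

```typescript
// Hypothetical allow list: passport variables considered safe to track.
const ALLOW_LIST: readonly string[] = ["proposal.projectType"];

// Simplified node shape (an assumption): the passport variable sits on
// either `fn` or `val`, and both are optional.
interface FlowNode {
  data?: { fn?: string; val?: string };
}

// Return the passport variable of a node, if it has one.
function getNodeFn(node: FlowNode): string | undefined {
  return node.data?.fn || node.data?.val;
}

// True only when the node has a passport variable that is on the allow list.
function isAllowListed(node: FlowNode): boolean {
  const nodeFn = getNodeFn(node);
  return nodeFn !== undefined && ALLOW_LIST.includes(nodeFn);
}
```

On a component transition, a check like `isAllowListed(previousNode)` would decide whether the user's answer is written to the log at all.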

Why:

Currently, nodes are generally identified in analytics by their node_title and node_type.
This is useful information, but there may be situations where it's useful to additionally know their fn or val, if one is assigned.
For example, the node_title could change, but we could easily continue to track via the passport variable, which is unlikely to change.
Also, some questions in flows could provide really useful insights to users while not containing any PII or being invasive to track.
These questions are harmless to track and hence could be on the allow_list.

Screen recording

Screen.Recording.2023-11-28.at.12.21.48.mov

Note

At the moment this only really works as a user goes forward in a flow. Do we need to track a user going backwards?
analytics_logs records are also created when going backwards, so even if a user has completed an answer for proposal.projectType, that answer wouldn't be stored on the analytics_log which was generated on hitting Back.
Would it make sense to track this when going backwards?

github-actions · 2023-11-27T17:12:07Z

🤖 Hasura Change Summary compared a subset of table metadata including permissions:

Updated Tables (1)

public.analytics_logs permissions:

	insert	select	update	delete
public	^➕/_➖	^➕/_➖	^➕/_➖

3 added column permissions

	insert	select	update
public	➕ node_fn	➕ node_fn	➕ allow_list_answers

github-actions · 2023-11-27T17:38:53Z

Pizza

Deployed 5e70e0d to https://2487.planx.pizza.

Useful links:

Mike-Heneghan · 2023-11-28T12:45:35Z

Track answers to allow-listed questions for guidance services (non save & return)

jessicamcinchak

Solution is looking good to me, but I am getting some unexpected results when going through a flow in the analytics_logs table on this pizza that I want to flag:

What was happening (db rows here):

- The first row for proposal.projectType is right
- But then my answer is incorrectly recorded for two Set Values with un-allow-listed fns of proposal.droppedKerbOnly & proposal.treeWorksOnly (is it possible to make the mutation param check stricter to ensure it only writes answers for allow-listed fns?)
- Then I hit a follow-up, more granular project type question "Select all the works to replace windows or doors" that correctly records answer ["alter.replace.windowsToWindows"]
- There are also a number of duplicates (maybe writing on component renders or user interactions within the card rather than whole-card transition??) - I realise this may not be introduced here and is a wider caveat of analytics tracking / happy for it to be thought about in a separate PR

As for the larger question in your description about currently only working on "forward" and whether we should track on "back":

My high-level opinion about this would be that we should try to track allow-list answers as similarly as possible to other interactions like "help click" & "input errors" - so if those are tracked on back, then let's try to track answer changes on back too. Helps keep the "explainability" of analytics methodology more straightforward for external consumers in the long run!

editor.planx.uk/src/pages/FlowEditor/lib/analyticsProvider.tsx

...rations/1700845956731_alter_table_public_analytics_logs_add_column_allow_list_answers/up.sql


Mike-Heneghan · 2023-11-29T12:55:34Z

I'm struggling to replicate the issue 🤔 Looking at the analytics it seems like you were using the find-out-if-you-need-planning-permission flow? Although that doesn't seem to have the questions you encountered.

If the flow you tested with is available would you mind sharing a link to it?

jessicamcinchak · 2023-11-29T13:02:42Z

@Mike-Heneghan I was using Lambeth's FOI on the pizza (FOI flows outside of active council teams likely don't have maintained content) - I'm almost always the only Linux user in pizza logs 🐧

The project type checklist and set values are nested in the permitteddevelopment flow / external portal. The set values will always be hidden from the user journey, but auto-answered in the background based on my project selection - so logging at the same time is expected I suppose (rather than a duplication), but recording the answer is not expected if the fn for that row neither matches nor is itself allow-listed, if that makes sense?

Mike-Heneghan · 2023-11-29T13:13:12Z

Thanks @jessicamcinchak! That's really helpful. I imagine the issue is that I wasn't testing auto-answer questions, which will be completed really quickly and are more susceptible to this issue! I'll use the Lambeth FOI to have something a bit more representative of the real behaviour.

- The passport variable of a node is on either the `fn` or `val` optional attribute
- It can be useful going forward for this data to be stored
- Node title might change but the passport variable might not, allowing us to better compare trends if rewording, for example
- Added a new column to the analytics_log of node_fn to store this value

analytics_logs
- Add new column on the analytics_logs to store allow_list_answers in an array
- On component transition check if the previous node was an allow list question
- If allow list then update the previous log with the answer
- This relies on the previousCard function, which is computationally expensive
- This can cause timing issues where the allowList answer of the previous node is stored in a log

- The function previousCard is computationally expensive
- Rather than repeatedly call previousCard, instead store what the node was in state on component transition
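The refactor described above can be sketched as follows. This is a minimal illustration that assumes nothing about the real provider beyond "remember the node on each transition"; `TransitionTracker` and `transitionTo` are hypothetical names, and in a React provider the remembered value would more likely live in state or a ref.

```typescript
// Hypothetical tracker: remembers the node shown before the current one so
// the previous node never has to be re-derived from the flow on each transition.
class TransitionTracker {
  private previousNodeId: string | null = null;

  // Call once per component transition. Returns the node the user has just
  // left (null on the very first transition), then remembers the new node.
  transitionTo(nextNodeId: string): string | null {
    const nodeJustLeft = this.previousNodeId;
    this.previousNodeId = nextNodeId;
    return nodeJustLeft;
  }
}
```

Reading the stored id is a constant-time lookup, whereas re-deriving the previous card repeats the expensive work mentioned in the commit message on every transition.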

- As per #2487 (comment)

- As per: #2487 (comment)

- Resolves: #2487 (review)
- If there's a timing issue, avoid storing sensitive data by verifying that the db node_fn is in the allowlist
- When finding answers, remove null values
- Note: doesn't resolve the edge case where subsequent questions with a node_fn may be updated with the previous question's answer
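A minimal sketch of the stricter guard described in this commit message, under the assumption that the node_fn stored on the log row is available before the update is written. `AnalyticsLogRow`, `answersToStore` and the contents of `allowList` are hypothetical; no real database or GraphQL call is shown.

```typescript
// Hypothetical allow list for illustration only.
const allowList: readonly string[] = ["proposal.projectType"];

// Assumed shape of a stored log row; only the column named in this PR is modelled.
interface AnalyticsLogRow {
  id: number;
  node_fn: string | null;
}

// Returns the answers that may be written to allow_list_answers,
// or null when nothing should be written for this log row.
function answersToStore(
  log: AnalyticsLogRow,
  answers: Array<string | null | undefined>,
): string[] | null {
  // Check the node_fn stored against the log itself, so that a timing issue
  // on the client cannot write an answer to a non allow-listed row.
  if (log.node_fn === null || !allowList.includes(log.node_fn)) return null;

  // Remove null (and undefined) values before storing.
  const cleaned = answers.filter((a): a is string => a != null);
  return cleaned.length > 0 ? cleaned : null;
}
```

As the note in the commit message says, a check like this does not cover the case where the next question is itself allow-listed and receives the previous question's answer.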

Mike-Heneghan · 2023-12-14T09:35:36Z

editor.planx.uk/src/pages/FlowEditor/lib/analyticsProvider.tsx

+    const { data } = flow[nodeId];
+    const nodeFn = data?.fn || data?.val;
+    if (nodeFn && allowList.includes(nodeFn)) {
+      const answerIds = breadcrumbs[nodeId]?.answers;


I think the reason this doesn't work going backwards is that the current approach assumes we always want to update the analytics log for the previous node with the answer stored in the current breadcrumbs.

When we move backwards through a flow the breadcrumbs are removed and the answers for the question go with them.

We could track what the saved answers were when a user loads a node they've already answered in the analytics logs.

Although I'd be tempted not to and only track allow_list answers when a user explicitly hits continue. I think this might be more consistent with the broad approach to user's answers?
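To illustrate the point about breadcrumbs, here is a small sketch that assumes breadcrumbs are a map from node id to an object with an optional `answers` array, as the diff above suggests. `Breadcrumbs`, `Direction` and `answersOnTransition` are hypothetical names, and the "only track on continue" behaviour is the option floated in this comment rather than a confirmed implementation.

```typescript
// Assumed shapes, for illustration only.
type Breadcrumbs = Record<string, { answers?: string[] }>;
type Direction = "forwards" | "backwards";

// Look up the answers for the node the user has just left, but only when
// they explicitly continued. Going backwards returns nothing, because the
// breadcrumb (and its answers) for that node has already been removed.
function answersOnTransition(
  direction: Direction,
  previousNodeId: string,
  breadcrumbs: Breadcrumbs,
): string[] | undefined {
  if (direction === "backwards") return undefined;
  return breadcrumbs[previousNodeId]?.answers;
}
```

Making the direction explicit keeps the behaviour predictable: a missing breadcrumb on Back is treated as "nothing to record" rather than as an error.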

Mike-Heneghan self-assigned this Nov 28, 2023

Mike-Heneghan requested a review from a team November 28, 2023 12:43

Mike-Heneghan marked this pull request as ready for review November 28, 2023 12:43

jessicamcinchak reviewed Nov 29, 2023


Mike-Heneghan added a commit that referenced this pull request Nov 29, 2023

chore: fix typo

bb4e4fb

- As per #2487 (comment)

Mike-Heneghan added a commit that referenced this pull request Nov 29, 2023

chore: remove unecessary conditional assigment of null

1a06af7

- As per: #2487 (comment)

Mike-Heneghan force-pushed the mh/track-allowlist-on-analytics-logs branch from 6028404 to 1a06af7 Compare November 29, 2023 12:53

Mike-Heneghan marked this pull request as draft November 29, 2023 14:05

Mike-Heneghan added 7 commits December 12, 2023 14:43

refactor: store the previous node in state rather than call previousCard

fec51e8

- The function previousCard is computationally expensive
- Rather than repeatedly call previousCard, instead store what the node was in state on component transition

refactor: update type and safely access attribute

b8d45cc

chore: fix typo

a101b4d

- As per #2487 (comment)

chore: remove unecessary conditional assigment of null

e055852

- As per: #2487 (comment)

Mike-Heneghan force-pushed the mh/track-allowlist-on-analytics-logs branch from 1a06af7 to 80512c9 Compare December 13, 2023 12:31

Mike-Heneghan commented Dec 14, 2023


Mike-Heneghan mentioned this pull request Dec 21, 2023

feat: track allow list answers #2597

Merged

Mike-Heneghan closed this Jan 10, 2024

Mike-Heneghan deleted the mh/track-allowlist-on-analytics-logs branch March 14, 2024 12:19
