Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Get allowlist answers from both breadcrumb data and answers #2838

Merged
merged 2 commits into from
Mar 13, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
43 changes: 38 additions & 5 deletions editor.planx.uk/src/pages/FlowEditor/lib/analyticsProvider.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -462,7 +462,7 @@ export const AnalyticsProvider: React.FC<{ children: React.ReactNode }> = ({
) {
if (shouldSkipTracking()) return;

const allowListAnswers = getAllowListAnswers(breadcrumb);
const allowListAnswers = getAllowListAnswers(nodeId, breadcrumb);
if (!allowListAnswers) return;

await publicClient.mutate({
Expand Down Expand Up @@ -491,17 +491,50 @@ export const AnalyticsProvider: React.FC<{ children: React.ReactNode }> = ({
}

function getAllowListAnswers(
nodeId: string,
breadcrumb: Store.userData,
): Record<string, any>[] | undefined {
): Record<string, unknown>[] | undefined {
const answers = getAnswers(nodeId);
const data = getData(breadcrumb);

const allowListAnswers = [...answers, ...data];
if (!allowListAnswers.length) return;

return allowListAnswers;
}

/**
* Extract allowlist answers from user answers
* e.g. from Checklist or Question components
*/
function getAnswers(nodeId: string) {
Mike-Heneghan marked this conversation as resolved.
Show resolved Hide resolved
const { data } = flow[nodeId];
const nodeFn: string = data?.fn || data?.val;
if (!nodeFn || !ALLOW_LIST.includes(nodeFn)) return [];

const answerIds = breadcrumbs[nodeId]?.answers;
if (!answerIds) return [];

const answerValues = answerIds.map((answerId) => flow[answerId]?.data?.val);

// Match data structure of `allow_list_answers` column
const answers = [ {[nodeFn]: answerValues } ];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a brief look at the analytics_logs I think as the data structure there'll now be a mismatch between current and old data i.e. array of values vs array of objects:

Screenshot 2024-03-01 at 19 03 45 Screenshot 2024-03-01 at 19 04 20

Although seeing as we haven't been tracking allow_list_list I wonder if we can just crack on?

Otherwise I think it would be possible to update historical data using the node_fn value as the key 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep - good point. First thought would be to ignore this as it's a new column, however having data in different formats in the column would prove tricky to query so we might as well do this properly!

There's more logs populated than expected already on production, so I think clearing older logs is not really a solution here.

I've tested the following SQL locally which will convert the column. Could you please review and manually test the SQL below @Mike-Heneghan before we run on production (once this change has gone to prod).

To select rows with the "old" data format -

SELECT
    id,
    node_fn,
    node_type,
    allow_list_answers
FROM
    analytics_logs
WHERE
    jsonb_typeof(allow_list_answers) = 'array'
    AND jsonb_typeof(allow_list_answers -> 0) = 'string';

And then this to update "old" → "new" -

UPDATE analytics_logs
SET allow_list_answers = (
    SELECT jsonb_agg(jsonb_build_object(key, value))
    FROM (
        SELECT 
            node_fn AS key, 
            jsonb_array_elements_text(allow_list_answers) AS value
        FROM analytics_logs
        WHERE jsonb_typeof(allow_list_answers) = 'array' AND 
              jsonb_typeof(allow_list_answers->0) = 'string'
    ) AS subquery
)
WHERE jsonb_typeof(allow_list_answers) = 'array' AND 
      jsonb_typeof(allow_list_answers->0) = 'string';

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Daf, I think there's something not quite right based on my local testing.

The SELECT works as expected:

Screenshot 2024-03-04 at 12 07 38

Although fetching the same records again post UPDATE:

Screenshot 2024-03-04 at 12 10 54

It appears it doesn't group the values against the key and also looks like every record has become an amalgamation of the rest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! That doesn't match my results - I maybe just had a simple use case 😅 Let me take another look!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! This sort of stuff is really fiddly, please let me know when you want me to retest 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this maybe has the same issue based on my testing 😅

Screenshot 2024-03-06 at 11 46 18 Screenshot 2024-03-06 at 11 47 40

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I get it sorry - I'm just testing on a single record 🤦

Probably just needs and id=id clause. Taking a look and thanks for testing!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey sorry for the slow turnaround here @Mike-Heneghan

Clause added and highlighted with comment -

UPDATE analytics_logs
SET allow_list_answers = (
    SELECT jsonb_agg(jsonb_build_object(node_fn, subquery.values))
    FROM (
        SELECT jsonb_agg(value) AS values
        FROM (
            SELECT 
              node_fn,
              jsonb_array_elements_text(allow_list_answers) AS value
            FROM analytics_logs inner_log
            WHERE 
              -- New clause to match ids
              inner_log.id = analytics_logs.id AND
              jsonb_typeof(inner_log.allow_list_answers) = 'array' AND 
              jsonb_typeof(inner_log.allow_list_answers->0) = 'string'
        ) AS subquery_inner
    ) AS subquery
)
WHERE 
  jsonb_typeof(allow_list_answers) = 'array' AND 
  jsonb_typeof(allow_list_answers->0) = 'string';

Getting expected results here, but as always would love another pair of eyes 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working as expected for me ✅

I regret not adding an order by to my query used to check results 😅

Although, comparing the updates are made as expected for me and it doesn't affect the data already in the new structure 🥳

Before:

Screenshot 2024-03-13 at 12 14 58

After:

Screenshot 2024-03-13 at 12 15 44

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah perfect - honestly thanks again for being so on-point with testing here, really helped! 🙌


return answers;
}

/**
* Extract allowlist answers from breadcrumb data
* e.g. data set automatically by components such as DrawBoundary
*/
function getData(breadcrumb: Store.userData) {
const dataSetByNode = breadcrumb.data;
if (!dataSetByNode) return;
if (!dataSetByNode) return [];

const answerValues = Object.entries(dataSetByNode)
.filter(([key, value]) => ALLOW_LIST.includes(key) && Boolean(value))
.map(([key, value]) => ({ [key]: value }));

if (!answerValues.length) return;

return answerValues;
}

Expand Down
Loading