Create serp_events_clients_daily table and view #6339

Open; wants to merge 5 commits into base: main

@m-d-bowerman m-d-bowerman commented Oct 11, 2024

Creates a clients-daily level table for the desktop SERP events data. This table will be further aggregated to a daily level to join with other search revenue and interactions tables.

┆Issue is synchronized with this Jira Task

@m-d-bowerman m-d-bowerman requested review from a team and alekhyamoz October 11, 2024 16:48
@alekhyamoz

This table won't contain data for the submission date because the base table is typically delayed by one day. Maybe the submission_date clause needs to be modified.

`moz-fx-data-shared-prod.firefox_desktop.serp_events`
WHERE
{% if is_init() %}
submission_date >= '2023-07-14'

Contributor

Can we please implement this without the templating?

Contributor Author

So create the table and then backfill instead?

Contributor Author

Updated with the template removed.
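
Note on the exchange above: without the `is_init()` branch, the history from 2023-07-14 onward has to be filled by running the query once per day instead of in a single initial run. A minimal Python sketch of enumerating those days follows. It assumes an inclusive date range, and `backfill_dates` is a hypothetical helper written for illustration, not part of bigquery-etl.

```python
from datetime import date, timedelta
from typing import Iterator


def backfill_dates(start: date, end: date) -> Iterator[date]:
    """Yield every date from start to end, inclusive (hypothetical helper)."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


# Each yielded date would be passed as the @submission_date parameter of one run.
dates = list(backfill_dates(date(2023, 7, 14), date(2023, 7, 17)))
for d in dates:
    print(d.isoformat())
```

An empty range (start after end) yields nothing, so a caller does not need a separate guard for it.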

@m-d-bowerman

This table won't contain data for the submission date because the base table is typically delayed by one day. Maybe the submission_date clause needs to be modified.

Ah duh, fixed this, thanks.
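
For context, the generated query in the integration report filters on `submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY)`, which appears to be the fix referred to here. The sketch below mirrors that offset in Python so the effect is easy to check. `source_partition` and `delay_days` are hypothetical names, and the one-day delay is taken from the review comment rather than measured.

```python
from datetime import date, timedelta


def source_partition(submission_date: date, delay_days: int = 1) -> date:
    """Return the serp_events partition read by a run for `submission_date`.

    Assumption: the base table lags by `delay_days`, so the run for a given
    day reads the partition from `delay_days` earlier.
    """
    return submission_date - timedelta(days=delay_days)


# A run parameterized with 2024-10-11 reads the partition for 2024-10-10.
print(source_partition(date(2024, 10, 11)))
```

Date arithmetic handles month and leap-year boundaries, so a run for 2024-03-01 reads 2024-02-29.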

@dataops-ci-bot

Integration report for "Merge branch 'main' into serp_events_clients_daily"

sql.diff

diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py
--- /tmp/workspace/main-generated-sql/dags/bqetl_search_dashboard.py	2024-10-11 19:35:42.000000000 +0000
+++ /tmp/workspace/generated-sql/dags/bqetl_search_dashboard.py	2024-10-11 19:38:03.000000000 +0000
@@ -370,6 +370,17 @@
         depends_on_past=False,
     )
 
+    search_derived__serp_events_clients_daily__v1 = bigquery_etl_query(
+        task_id="search_derived__serp_events_clients_daily__v1",
+        destination_table="serp_events_clients_daily_v1",
+        dataset_id="search_derived",
+        project_id="moz-fx-data-shared-prod",
+        owner="mozilla/revenue_forecasting_data_reviewers",
+        email=["[email protected]", "[email protected]"],
+        date_partition_parameter="submission_date",
+        depends_on_past=False,
+    )
+
     search_derived__desktop_search_aggregates_by_userstate__v1.set_upstream(
         wait_for_checks__fail_telemetry_derived__clients_last_seen__v2
     )
@@ -525,3 +536,7 @@
     search_derived__search_revenue_levers_daily__v1.set_upstream(
         search_derived__search_dau_aggregates__v1
     )
+
+    search_derived__serp_events_clients_daily__v1.set_upstream(
+        wait_for_firefox_desktop_serp_events__v2
+    )
Only in /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search: serp_events_clients_daily
Only in /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived: serp_events_clients_daily_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/metadata.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/metadata.yaml	2024-10-11 19:33:14.000000000 +0000
@@ -0,0 +1,13 @@
+friendly_name: Serp Events Clients Daily
+description: |-
+  Please provide a description for the query
+owners: []
+labels: {}
+bigquery: null
+workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+references:
+  view.sql:
+  - moz-fx-data-shared-prod.search_derived.serp_events_clients_daily_v1
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/view.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/view.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/view.sql	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search/serp_events_clients_daily/view.sql	2024-10-11 19:30:49.000000000 +0000
@@ -0,0 +1,7 @@
+CREATE OR REPLACE VIEW
+  `moz-fx-data-shared-prod.search.serp_events_clients_daily`
+AS
+SELECT
+  *
+FROM
+  `moz-fx-data-shared-prod.search_derived.serp_events_clients_daily_v1`
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/metadata.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/metadata.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/metadata.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/metadata.yaml	2024-10-11 19:33:11.000000000 +0000
@@ -0,0 +1,24 @@
+friendly_name: SERP Events Clients Daily
+description: |-
+  Aggregation of the desktop SERP Events data to the client-daily level.
+owners:
+- mozilla/revenue_forecasting_data_reviewers
+labels:
+  incremental: true
+  schedule: daily
+  dag: bqetl_search_dashboard
+scheduling:
+  dag_name: bqetl_search_dashboard
+bigquery:
+  time_partitioning:
+    type: day
+    field: submission_date
+    require_partition_filter: true
+    expiration_days: null
+  range_partitioning: null
+  clustering: null
+workgroup_access:
+- role: roles/bigquery.dataViewer
+  members:
+  - workgroup:mozilla-confidential
+references: {}
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/query.sql /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/query.sql
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/query.sql	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/query.sql	2024-10-11 19:30:49.000000000 +0000
@@ -0,0 +1,51 @@
+SELECT
+  submission_date,
+  glean_client_id,
+  legacy_telemetry_client_id,
+  profile_group_id,
+  `moz-fx-data-shared-prod`.udf.normalize_search_engine(search_engine) AS partner,
+  'desktop' AS device,
+  normalized_country_code,
+  normalized_channel,
+  os,
+  browser_version_info.major_version AS browser_major_version,
+  browser_version_info.minor_version AS browser_minor_version,
+  ANY_VALUE(experiments) AS experiments,
+  LOGICAL_OR(ad_blocker_inferred) AS ad_blocker_inferred,
+  COUNT(
+    DISTINCT IF(
+      REGEXP_CONTAINS(sap_source, 'urlbar')
+      OR sap_source IN ('searchbar', 'contextmenu', 'webextension', 'system'),
+      impression_id,
+      NULL
+    )
+  ) AS sap,
+  COUNTIF(is_tagged) AS tagged_sap,
+  COUNTIF(is_tagged AND REGEXP_CONTAINS(sap_source, 'follow_on')) AS tagged_follow_on,
+  SUM(num_ad_clicks) AS ad_click,
+  COUNTIF(num_ads_visible > 0) AS search_with_ads,
+  COUNTIF(NOT is_tagged) AS organic,
+  SUM(IF(NOT is_tagged, num_ad_clicks, 0)) AS ad_click_organic,
+  COUNTIF(num_ads_visible > 0 AND NOT is_tagged) AS search_with_ads_organic,
+  -- serp_events does not have distribution ID or partner codes to calculate monetizable SAP
+  COUNTIF(ad_blocker_inferred) AS sap_with_ad_blocker_inferred,
+  SUM(num_ads_visible) AS num_ads_visible,
+  SUM(num_ads_blocked) AS num_ads_blocked,
+  SUM(num_ads_notshowing) AS num_ads_notshowing,
+  COUNTIF(abandon_reason IS NOT NULL) AS num_abandoned_serp
+FROM
+  `moz-fx-data-shared-prod.firefox_desktop.serp_events`
+WHERE
+  submission_date = DATE_SUB(@submission_date, INTERVAL 1 DAY)
+GROUP BY
+  submission_date,
+  glean_client_id,
+  legacy_telemetry_client_id,
+  profile_group_id,
+  partner,
+  device,
+  normalized_country_code,
+  normalized_channel,
+  os,
+  browser_major_version,
+  browser_minor_version
diff -bur --no-dereference --new-file /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/schema.yaml /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/schema.yaml
--- /tmp/workspace/main-generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/schema.yaml	1970-01-01 00:00:00.000000000 +0000
+++ /tmp/workspace/generated-sql/sql/moz-fx-data-shared-prod/search_derived/serp_events_clients_daily_v1/schema.yaml	2024-10-11 19:30:49.000000000 +0000
@@ -0,0 +1,100 @@
+fields:
+- mode: NULLABLE
+  name: submission_date
+  type: DATE
+- mode: NULLABLE
+  name: glean_client_id
+  type: STRING
+- mode: NULLABLE
+  name: legacy_telemetry_client_id
+  type: STRING
+- mode: NULLABLE
+  name: profile_group_id
+  type: STRING
+- mode: NULLABLE
+  name: partner
+  type: STRING
+- mode: NULLABLE
+  name: device
+  type: STRING
+- mode: NULLABLE
+  name: normalized_country_code
+  type: STRING
+- mode: NULLABLE
+  name: normalized_channel
+  type: STRING
+- mode: NULLABLE
+  name: os
+  type: STRING
+- mode: NULLABLE
+  name: browser_major_version
+  type: NUMERIC
+- mode: NULLABLE
+  name: browser_minor_version
+  type: NUMERIC
+- fields:
+  - mode: NULLABLE
+    name: key
+    type: STRING
+  - fields:
+    - mode: NULLABLE
+      name: branch
+      type: STRING
+    - fields:
+      - mode: NULLABLE
+        name: type
+        type: STRING
+      - mode: NULLABLE
+        name: enrollment_id
+        type: STRING
+      mode: NULLABLE
+      name: extra
+      type: RECORD
+    mode: NULLABLE
+    name: value
+    type: RECORD
+  mode: REPEATED
+  name: experiments
+  type: RECORD
+- mode: NULLABLE
+  name: ad_blocker_inferred
+  type: BOOLEAN
+- mode: NULLABLE
+  name: sap
+  type: INTEGER
+- mode: NULLABLE
+  name: tagged_sap
+  type: INTEGER
+- mode: NULLABLE
+  name: tagged_follow_on
+  type: INTEGER
+- mode: NULLABLE
+  name: ad_click
+  type: INTEGER
+- mode: NULLABLE
+  name: search_with_ads
+  type: INTEGER
+- mode: NULLABLE
+  name: organic
+  type: INTEGER
+- mode: NULLABLE
+  name: ad_click_organic
+  type: INTEGER
+- mode: NULLABLE
+  name: search_with_ads_organic
+  type: INTEGER
+- mode: NULLABLE
+  name: sap_with_ad_blocker_inferred
+  type: INTEGER
+- mode: NULLABLE
+  name: num_ads_visible
+  type: INTEGER
+- mode: NULLABLE
+  name: num_ads_blocked
+  type: INTEGER
+- mode: NULLABLE
+  name: num_ads_notshowing
+  type: INTEGER
+- mode: NULLABLE
+  name: num_abandoned_serp
+  type: INTEGER
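
To make the aggregation in `query.sql` easier to follow, here is a small Python sketch that reproduces a subset of its metrics (`sap`, `tagged_sap`, `organic`, `ad_click`, `search_with_ads`) on made-up rows. It is an illustration under stated assumptions, not the production logic: the sample rows are invented, the grouping key is reduced to three columns, and `partner` is assumed to be already normalized.

```python
from collections import defaultdict

# Sources that count toward SAP in the query, besides any source containing 'urlbar'.
SAP_SOURCES = {"searchbar", "contextmenu", "webextension", "system"}


def is_sap(source):
    """Mirror REGEXP_CONTAINS(sap_source, 'urlbar') OR sap_source IN (...)."""
    return source is not None and ("urlbar" in source or source in SAP_SOURCES)


def aggregate(rows):
    """Group rows by (submission_date, glean_client_id, partner) and compute metrics."""
    groups = defaultdict(
        lambda: {"sap_ids": set(), "tagged_sap": 0, "organic": 0,
                 "ad_click": 0, "search_with_ads": 0}
    )
    for row in rows:
        g = groups[(row["submission_date"], row["glean_client_id"], row["partner"])]
        if is_sap(row["sap_source"]):
            g["sap_ids"].add(row["impression_id"])      # COUNT(DISTINCT ...)
        g["tagged_sap"] += row["is_tagged"]             # COUNTIF(is_tagged)
        g["organic"] += not row["is_tagged"]            # COUNTIF(NOT is_tagged)
        g["ad_click"] += row["num_ad_clicks"]           # SUM(num_ad_clicks)
        g["search_with_ads"] += row["num_ads_visible"] > 0
    return {
        key: {"sap": len(g["sap_ids"]),
              **{k: v for k, v in g.items() if k != "sap_ids"}}
        for key, g in groups.items()
    }


# Invented sample rows for illustration only.
rows = [
    {"submission_date": "2024-10-10", "glean_client_id": "a", "partner": "Google",
     "impression_id": "i1", "sap_source": "urlbar", "is_tagged": True,
     "num_ad_clicks": 1, "num_ads_visible": 3},
    {"submission_date": "2024-10-10", "glean_client_id": "a", "partner": "Google",
     "impression_id": "i2", "sap_source": "follow_on_from_refresh", "is_tagged": True,
     "num_ad_clicks": 0, "num_ads_visible": 0},
    {"submission_date": "2024-10-10", "glean_client_id": "a", "partner": "Google",
     "impression_id": "i3", "sap_source": "searchbar", "is_tagged": False,
     "num_ad_clicks": 2, "num_ads_visible": 1},
]
result = aggregate(rows)
print(result[("2024-10-10", "a", "Google")])
```

Note that `sap` counts distinct impressions from the listed sources only, while `tagged_sap` counts every tagged row, so the two can differ for the same client and day.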
