Missing state/city in iOS first seen and active users tables prior to 5/17/2024 #6555
Comments
➤ George Kaberere commented: Hey Krzysztof Ignasiak, could you take a look at this ticket?
➤ Krzysztof Ignasiak commented: Hey Alex He, I was just having a look at this. Could you please provide more details about the problem you’re seeing? I just tried running the query, and the geo data appears to be there, going back as far as 2020. I specifically took a look at Chicago: https://sql.telemetry.mozilla.org/queries/104583/source#257586 (attachment: image-20241220-132922.png) The query I used: SELECT
➤ Alex He commented: Hi Kik, when I run the query below, the iOS new profiles data appears to start on May 17, 2024. No city-level geo info is available prior to that date. (attachment: image-20250116-145956.png)
➤ Krzysztof Ignasiak commented: Alex He, OK, I had a look, and it appears the geo_subdivision filter is what limits the data in your query to May 2024 onward. I will try to find out when the geo_subdivision information was added. If the data is in the stable ping for dates prior to May 2024, it just means it was not added to our derived ETL pipeline until that date. At that point the question becomes whether it is worth the cost of recalculating all the baseline tables so that this field becomes available. That would require both money and time.
➤ Alex He commented: For my study I joined the table with mozdata.org_mozilla_ios_firefox.baseline to get the geo_subdivision (state) info. The workaround works for now. If it takes a huge effort to fix, I don’t think it is worth it.
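The join workaround described above can be sketched as follows. This is only an illustration on toy data in an in-memory SQLite database, not the query actually used in the thread: the simplified table names, the column names (client_id, first_seen_date, submission_date, city, geo_subdivision), the join on the first-seen date, and the helper new_profiles_by_day are all assumptions, and the real BigQuery schemas may differ.

```python
import sqlite3

# Toy stand-ins for mozdata.firefox_ios.baseline_clients_first_seen and
# mozdata.org_mozilla_ios_firefox.baseline. Column names are assumptions
# made for illustration; the real BigQuery schemas may differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE baseline_clients_first_seen (
    client_id TEXT, first_seen_date TEXT, city TEXT, geo_subdivision TEXT
);
CREATE TABLE baseline (
    client_id TEXT, submission_date TEXT, city TEXT, geo_subdivision TEXT
);
-- Made-up rows: the derived table has no geo_subdivision before 2024-05-17.
INSERT INTO baseline_clients_first_seen VALUES
    ('a', '2024-03-01', 'Chicago', NULL),
    ('b', '2024-03-02', 'Chicago', NULL),
    ('c', '2024-06-01', 'Chicago', 'IL');
-- The ping-level table carries geo_subdivision for every date.
INSERT INTO baseline VALUES
    ('a', '2024-03-01', 'Chicago', 'IL'),
    ('a', '2024-03-05', 'Chicago', 'IL'),
    ('b', '2024-03-02', 'Chicago', 'IN'),
    ('c', '2024-06-01', 'Chicago', 'IL');
""")


def new_profiles_by_day(conn, city, subdivision):
    """Count new profiles per first-seen date, taking geo from the ping table."""
    query = """
        SELECT fs.first_seen_date, COUNT(DISTINCT fs.client_id) AS new_profiles
        FROM baseline_clients_first_seen AS fs
        JOIN baseline AS b
          ON b.client_id = fs.client_id
         AND b.submission_date = fs.first_seen_date
        WHERE b.city = ? AND b.geo_subdivision = ?
        GROUP BY fs.first_seen_date
        ORDER BY fs.first_seen_date
    """
    return conn.execute(query, (city, subdivision)).fetchall()


print(new_profiles_by_day(conn, "Chicago", "IL"))
```

The filter is applied to the ping-level geo columns rather than to the derived table, so rows from before the field was added to the ETL are still matched. Scanning the ping-level table is likely more expensive than querying the derived table alone.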
➤ Krzysztof Ignasiak commented: Hey Alex He, I just took a look, and indeed this data exists in the stable dataset. The geo_subdivision field was only added to our derived ETL (which creates tables like baseline_clients_daily and baseline_clients_first_seen) on May 17th, funnily enough by me: b460280 (b4602805d9655a170b83ed4c30f401acfbd030c7). If I recall correctly, it was added to make the global_outages dataset more accurate, and at the time there was no need to recalculate historical data. My main concern here is that we’d have to rebuild all of the baseline tables for this data to be available prior to May 2024 in the original table.
https://sql.telemetry.mozilla.org/queries/102596/source
The state/city info is missing in mozdata.firefox_ios.baseline_clients_first_seen prior to 5/17/2024. Is there any other way we can get the new profiles in a specific city (like Chicago, IL) prior to that date?
┆Issue is synchronized with this Jira Story
┆Attachments: image-20241220-132922.png | image-20250116-145956.png