move cent conversions upstream to source package #101

Open
wants to merge 17 commits into
base: main
from

Conversation

fivetran-reneeli
Contributor

@fivetran-reneeli fivetran-reneeli commented Dec 18, 2024

PR Overview

This PR will address the following Issue/Feature: #100

This PR will result in the following new package version: v0.16.0

Breaking change as customers using only the staging models will experience changes in values in currency-related fields.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

Breaking Changes

  • This package assumes that amount-based fields, which Stripe stores as raw values in the smallest denomination, are cent-based. This PR moves the existing cents-to-dollars conversion further upstream: currency-related fields were previously converted in downstream models and are now converted directly in the staging models. Because currency-related fields in the staging models now have different values, this is a breaking change.

To disable this default conversion, refer to the README on disabling the stripe__amount_divide variable.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run --full-refresh && dbt test
  • dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

See validation link in internal ticket

If you had to summarize this PR in an emoji, which would it be?

💃

Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for working through this PR. I have a few comments and requests. In particular, I also would like to discuss some more why we only divided by 100 in a few particular places in previous versions of this package. I want to make sure we aren't missing anything before we apply an update like this.

CHANGELOG.md Outdated
Comment on lines 3 to 6
## Breaking Changes
- This package assumes that amount-based fields, which as raw values are represented in the smallest denomination in Stripe, are cent-based. This PR shifts the existing conversion from cents to dollars to further upstream. Previously, currency-related fields were converted in downstream models, but now have been converted directly in staging models. Since currency-related fields now have different values, this is a breaking change.

To disable this default conversion, refer to the [README](https://github.com/fivetran/dbt_stripe/blob/main/README.md#disabling-cent-to-dollar-conversion) on disabling the `stripe__amount_divide` variable.
Contributor

See notes from the source PR to make similar updates here.

Contributor

Additionally, we should highlight each field in the end models which are seeing this update. This will drastically change the outputs if customers were not previously aware. We should make this obvious to customers.

Contributor Author

Would this change the outputs if the values of the fields remain the same? For example, total_sales in customer_overview would still retain the same value; the division just occurs farther upstream. I can add a callout for the fields where the code has changed, though. You also make a good point that the division didn't occur across all end models originally. I'll circle back on that in the other comment.
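
To illustrate the point above, here is a minimal Python sketch (not package code; the sample amounts and function names are hypothetical) comparing division after aggregation with division before aggregation. Under these assumptions the two totals agree up to floating-point rounding, which is why an end-model value such as total_sales should stay the same when the division moves upstream.

```python
import math

# Hypothetical raw Stripe amounts in the smallest denomination (cents).
raw_amounts_cents = [1999, 501, 250, 12345]


def total_divided_downstream(amounts):
    """Previous behavior: sum the raw cents, then divide once in the end model."""
    return sum(amounts) / 100.0


def total_divided_upstream(amounts):
    """Proposed behavior: divide each row in staging, then sum downstream."""
    return sum(a / 100.0 for a in amounts)


downstream = total_divided_downstream(raw_amounts_cents)
upstream = total_divided_upstream(raw_amounts_cents)

# Compare with a tolerance: summing already-divided floats can differ
# from dividing the integer sum by a tiny rounding error.
print(math.isclose(downstream, upstream, rel_tol=1e-9))
```

The tolerance-based comparison is deliberate: exact equality between the two totals is not guaranteed with floating-point arithmetic.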

README.md Outdated
Comment on lines 223 to 230
#### Disabling Cent to Dollar Conversion

Amount-based fields, such as `amount` and `net`, are typically displayed in the smallest denomination (e.g., cents for USD). By default, these values are automatically converted to dollars by dividing by `100.0`. To disable this conversion and retain the values in their smallest denomination, set the `stripe__amount_divide` variable to `False` as shown below:

```yml
vars:
  stripe__amount_divide: False
```
Contributor

See source PR notes to make the same updates here.

packages.yml Outdated
Comment on lines 2 to 7
  # version: [">=0.13.0", "<0.14.0"]

  - git: https://github.com/fivetran/dbt_stripe_source.git
    revision: feature/standardize_cent_conversions
    warn-unpinned: false
Contributor

Reminder to swap before merge

coalesce(transactions_grouped.total_gross_transaction_amount/100.0, 0) as total_gross_transaction_amount,
coalesce(transactions_grouped.total_fees/100.0, 0) as total_fees,
coalesce(transactions_grouped.total_net_transaction_amount/100.0, 0) as total_net_transaction_amount,
coalesce(transactions_grouped.total_sales, 0) as total_sales,
Contributor

Why did we only divide by 100 for the int_stripe__account_daily and stripe__customer_overview models? We didn't apply this division in the other (arguably more popular) end models, such as balance transaction. Are we missing something here? Should we consider retaining the smallest unit by default and instead let customers enable the conversion if they want it?

Contributor Author

Oh hmm. I did some digging into the commit histories of the models. I think this is a product of when each model was introduced. For instance, int_stripe__account_daily was created with the /100 in place, but some of the older models never had it, so my guess is that a retroactive update was never applied to those.

Are you saying to set the conversion to False by default? I guess that would make more sense from an impact perspective, since more fields were never converted than were. That way, fewer models would be impacted than if the conversion defaulted to True.

If that is the case, this means we keep the changes made to int_stripe__account_daily and stripe__customer_overview, but I will update the documentation and run script accordingly.

Contributor

Thanks for the investigation here! This is really interesting and a bit confusing why we would apply this divide by 100 approach for a select few models, but not others.

Given this, I would request we take the following approach. We set the default state of the stripe__convert_values variable to false. We then apply an additional conditional to the end models that divide by 100, so they only divide by 100 if the variable is false. This way we can ensure this update is backwards compatible and does not cause major breaking changes. These end models would look like the following.

{{ 'coalesce(transactions_grouped.total_refunds/100.0, 0)' if not var('stripe__convert_values', false) else 'coalesce(transactions_grouped.total_refunds, 0)' }} as total_refunds,
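
As a rough model of the conditional suggested above, the following Python sketch traces one amount through a staging step and an end-model step for both values of `stripe__convert_values`. The function names and the sample amount are hypothetical; only the variable name and the divide-by-100 logic come from this thread.

```python
def staging_amount(raw_cents, convert_values=False):
    """Staging layer: divide by 100 only when the flag is enabled."""
    return raw_cents / 100.0 if convert_values else raw_cents


def end_model_total(staged_total, convert_values=False):
    """End model: mirrors the suggested Jinja conditional.

    Divides by 100 only when staging did NOT already convert, and
    falls back to 0 for missing values, like coalesce(..., 0).
    """
    if staged_total is None:
        return 0
    return staged_total if convert_values else staged_total / 100.0


for flag in (False, True):
    staged = staging_amount(12345, convert_values=flag)
    print(flag, staged, end_model_total(staged, convert_values=flag))
```

In this sketch the end model returns the dollar value for either setting of the flag, which is what keeps these two models backwards compatible. It also means that setting the flag to false does not leave these particular end-model fields in cents.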

Contributor Author

Updated and added a note to the CHANGELOG.

@fivetran-reneeli fivetran-reneeli linked an issue Jan 8, 2025 that may be closed by this pull request
4 tasks
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-reneeli thanks for these edits, a few more comments in the review. Thanks!

README.md Outdated

```yml
vars:
  stripe__convert_values: False
```
Contributor

Based on my other comment, let's change this default to false and ensure the docs show how to change this to true by default.

Contributor Author

updated

CHANGELOG.md Outdated
@@ -1,3 +1,16 @@
# dbt_stripe v0.16.0
Contributor

We'll need to update this now that the new variable defaults to false. We can likely remove the breaking change impacts, but we should mention that the default value of this variable will change in the future so customers are aware. We should also describe the impact for customers who want to set the variable to true now.

Contributor Author

downgraded the version to 0.15.2 and made changes to the wording

@fivetran-reneeli fivetran-reneeli linked an issue Jan 14, 2025 that may be closed by this pull request
4 tasks
@fivetran-reneeli
Contributor Author

@fivetran-joemarkiewicz ready for re-review! I made the /100 disabled by default, with the exception of where I added the backward compatibility. I also added a consistency test to customer_overview and switched the run script to test for stripe__convert_values: true. And updated the relevant documentation.

Contributor Author

We discussed this during a recent team huddle, and the team raised a good point: this may be misleading if a customer intends to turn off convert_values but this model still includes the division because of the inverse logic. So this may not be the best approach. I can workshop another strategy.

Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Generally this is looking really good! I have a few comments below before approving

@@ -1,6 +1,27 @@
# dbt_stripe version.version
# dbt_stripe v0.15.2
Contributor

Please make the same update suggestions from the source PR here as well.

Contributor Author

updated

README.md Outdated
@@ -74,7 +75,7 @@ Include the following stripe package version in your `packages.yml` file:
```yaml
packages:
  - package: fivetran/stripe
-    version: [">=0.15.0", "<0.16.0"]
+    version: [">=0.16.0", "<0.17.0"]
```
Contributor

This update isn't necessary anymore now that this is a patch update.

Contributor Author

updated

models/docs.md Outdated
@@ -1,2 +1,6 @@
{% docs source_relation -%} The source where this data was pulled from. If you are making use of the `union_schemas` variable, this will be the source schema. If you are making use of the `union_databases` variable, this will be the source database. If you are not unioning together multiple sources, this will be an empty string.
{%- enddocs %}

{% docs convert_values -%}
Contributor

This actually isn't necessary since this exists in the source. So you can remove this.

Contributor Author

removed

select
{% for col in cols %}
{% if not loop.first %}, {% endif %}
floor(sum({{ col }})) as summed_{{ col }} -- floor and sum is to keep consistency between dev and prod aggs
Contributor

When using our internal data, I'm running into a test failure with this consistency test. Can you investigate why this test is failing?

Contributor Author

Good idea to test with that schema-- it came down to incorrectly placed logic in the customer_overview model, I fixed it and the validation tests now all pass with that schema.
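
For readers unfamiliar with the consistency test discussed in this thread, here is a minimal Python sketch of the floor(sum(col)) comparison between a prod and a dev build. The rows, column list, and helper name are hypothetical and the real test is written in Jinja-SQL; this only models the aggregation idea.

```python
import math


def summed_columns(rows, cols):
    """Return floor(sum(col)) per column, skipping missing (None) values.

    Note: SQL sum() returns NULL when every value is NULL, whereas this
    sketch returns 0 in that case.
    """
    return {
        f"summed_{col}": math.floor(sum(r[col] for r in rows if r[col] is not None))
        for col in cols
    }


cols = ["total_sales", "total_refunds"]

# Hypothetical prod and dev outputs: same values, different row order.
prod_rows = [
    {"total_sales": 19.99, "total_refunds": 5.01},
    {"total_sales": 100.5, "total_refunds": None},
]
dev_rows = [
    {"total_sales": 100.5, "total_refunds": None},
    {"total_sales": 19.99, "total_refunds": 5.01},
]

prod = summed_columns(prod_rows, cols)
dev = summed_columns(dev_rows, cols)

# Any column whose floored sum differs between the two builds is a failure.
mismatches = {k: (prod[k], dev[k]) for k in prod if prod[k] != dev[k]}
print(mismatches)
```

Flooring the sums keeps small floating-point differences between builds from failing the comparison, although a sum sitting very close to a whole number could still floor differently.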
