Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure payouts file #435

Merged
merged 16 commits into from
Jan 27, 2025
Merged

Restructure payouts file #435

merged 16 commits into from
Jan 27, 2025

Conversation

fhenneke
Copy link
Collaborator

@fhenneke fhenneke commented Nov 21, 2024

This PR is an attempt at implementing #427. The PR does not change the values of final results (up to rounding of floats). It changes the structure of the code and intermediate representations.

As a first step, it separates the computation of different parts of the accounting into different functions.

    # compute individual components of payments
    solver_info = compute_solver_info(
        reward_target_df,
        service_fee_df,
        config,
    )
    rewards = compute_rewards(
        batch_data,
        quote_rewards_df,
        exchange_rate_native_to_cow,
        config.reward_config,
    )
    protocol_fees = compute_protocol_fees(batch_data)
    partner_fees = compute_partner_fees(batch_data, config.protocol_fee_config)
    buffer_accounting = compute_buffer_accounting(batch_data, slippage_df)

Those functions are implemented in separate files

The results of these steps are converted into data frames for solver payments and for protocol and partner fee payments.

    # combine into solver payouts and partner payouts
    solver_payouts = compute_solver_payouts(
        solver_info, rewards, protocol_fees, buffer_accounting
    )
    partner_payouts = (
        partner_fees  # no additional computation required here at the moment
    )

Payout data on transfers and overdrafts is then computed from solver and parner payouts data.

    payouts = prepare_payouts(solver_payouts, partner_payouts, dune.period, config)

The code can be tested to produce the same transfer files as the old code.
Tests have been adapted and cover essentially what was tested before. There is a bit more strictness in the testing of the separate computations of rewards, protocol fees, partner fees, etc.
There is no end-to-end test for payments yet. This should be added at some point.

Future changes could remove data frames from intermediate results. This would make it easier to have correct types and detect and handle missing data. Data for the different parts of the accounting can be changed to use intermediate tables generated by src/data_sync/sync_data.py.

harisang added a commit that referenced this pull request Dec 3, 2024
This PR changes how missing data for reward targets is identified. It
fixes a bug where missing data causes the script to crash.

The issue seems to come from a column without valid values being
misidentified as float column and the missing value is encoded as `NaN`.
This was not identified as missing data since it only checks for `None`.

With this PR, the pandas function `isna` is used to identify missing
data.

A similar approach was used in #435.

Since local tests are still running, I created this as draft PR. I will
remove the draft status once the local run is successful.

---------

Co-authored-by: Haris Angelidakis <[email protected]>
@fhenneke fhenneke force-pushed the restructure_payouts branch from e2e51c8 to ffcadef Compare January 20, 2025 11:58
@fhenneke fhenneke marked this pull request as ready for review January 20, 2025 13:30
@fhenneke fhenneke changed the title [Draft] Restructure payouts Restructure payouts file Jan 20, 2025
@fhenneke fhenneke requested review from harisang and bram-vdberg and removed request for harisang and bram-vdberg January 20, 2025 13:40
) -> DataFrame:
"""Compute buffer accounting per solver"""

# validate batch rewards and quote rewards columns
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment seems wrong

src/fetch/partner_fees.py Outdated Show resolved Hide resolved
src/fetch/rewards.py Outdated Show resolved Hide resolved

def compute_rewards(
batch_data: DataFrame,
quote_rewards: DataFrame,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would rename this to quote_data, to make it more uniform and also because there are no rewards stored in that dataframe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we could be more explicit by writing

  • batch_data_per_solver
  • quote_data_per_solver

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am torn a bit between using more descriptive names (e.g. batch_data_per_solver) and more general names (e.g. batch_data).

On the one hand, in the end, the general code should not depend on what data exactly is needed to compute rewards, it just depends on batch_data and we choose what that entails. And the structure of that part of the code was originally intended to end up as something like

rewards = compute_rewards(orderbook, dune, config)

where all the fetching and computing is abstracted away.

On the other hand, we will not end up with a clean version any time soon. So we might as well be as explicit as possible for now.

]


def compute_rewards(
Copy link
Contributor

@harisang harisang Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking whether we should express rewards in their "native currency", i.e., batch rewards in native token and quote rewards in COW, and then at the final step, when we build the payout, actually do the conversions.

This would mean dropping the primary_reward_cow column, as well as not pass the exchange_rate as a parameter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, the intention was to have a data frame rewards containing all information for computing reward payments. This also works smoothly with the rest of the payments code. It does completely decouple the reward computation and the creation of payments.

I thought about relaxing this to being able to compute all reward information from rewards and reward_config. This would allow dropping the strange 'reward_token_address' column. But it would still require exchange rates to be part of that data frame.

src/fetch/solver_info.py Outdated Show resolved Hide resolved
for service_fees_flag in solver_info["service_fee"]
]

if not solver_info["solver"].is_unique:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this happen as part of this function, or the only way this can happen is if the input table itself (reward_targets) contains duplicates?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll will have another look at that. I do remember needing it at some point but those queries have changed a bit since then.
Potentially, this can be changed to an assert, and the actual changing of data happens in the dune fetching, as for other queries.
At some point, all the processing might take place in compute_solver_info, but for now we should keep it consistent.

src/fetch/payouts.py Outdated Show resolved Hide resolved
"0x"-prefixed hex representation of address of a solvers bonding pool.
solver_name: str
Name of a solver.
service_fees : DataFrame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is off with indentation here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the format a bit to make it clearer what is an input and what is a column.

src/fetch/solver_info.py Outdated Show resolved Hide resolved
@fhenneke
Copy link
Collaborator Author

I added some documentation for most functions which were added or changed.

I am trying to stick to a numpy/scipy style docstring style. Additionally, I added explanations for all expected columns.

src/fetch/payouts.py Outdated Show resolved Hide resolved
solver_payouts["reward_token_address"] = (
solver_payouts["reward_token_address"]
.fillna(
"0x0000000000000000000000000000000000000001"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ever needed btw?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could alternatively add an assert here as otherwise it the dataframe doesn't make sense (unless the corresponding row contains all zeros).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a bit of code smell. I added it because without this default there were errors. But I do not exactly know why.

The reward token address is a required field in the RewardAndPenaltyDatum class and will be turned into an Address. The address is, however, never used for solvers who do not get a reward. In principle, whenever a solver settled an auction or provided a quote, they should have an entry in rewards and thus filling the in defaults should not be required.

I will have to have another look (or follow up on this in another PR).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this default value and adapted one of the tests accordingly.

Copy link
Contributor

@harisang harisang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! There is one small comment to be addressed but not a deal breaker

@fhenneke fhenneke added the E:9.1 Streamline mainnet payouts process See https://github.com/cowprotocol/pm/issues/80 for details label Jan 27, 2025
@fhenneke fhenneke merged commit 58fdf14 into main Jan 27, 2025
6 checks passed
@fhenneke fhenneke deleted the restructure_payouts branch January 27, 2025 09:34
@github-actions github-actions bot locked and limited conversation to collaborators Jan 27, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
E:9.1 Streamline mainnet payouts process See https://github.com/cowprotocol/pm/issues/80 for details
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants