Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DataTransformer] Updating metadata does not behave as expected #407

Open
UnravelSports opened this issue Jan 18, 2025 · 6 comments · May be fixed by #422
Open

[DataTransformer] Updating metadata does not behave as expected #407

UnravelSports opened this issue Jan 18, 2025 · 6 comments · May be fixed by #422

Comments

@UnravelSports
Copy link
Contributor

Let's assume I want to update the default Opta pitch coordinates, something like:

PITCH_LENGTH = 104
PITCH_WIDTH = 60

  dataset = opta.load(
      f24_data="some_file",
      f7_data="some_other_file"
  )
  dataset = dataset.transform(
      to_pitch_dimensions=MetricPitchDimensions(
          x_dim=Dimension(
              -1 * PITCH_LENGTH / 2, PITCH_LENGTH / 2
          ),
          y_dim=Dimension(
              -1 * PITCH_WIDTH / 2, PITCH_WIDTH / 2
          ),
          pitch_length=PITCH_LENGTH,
          pitch_width=PITCH_WIDTH,
          standardized=False,
      )
  )

If I now look at the actual coordinates, they are transformed correctly. However, the dataset.metadata does not seem to get updated accordingly.

We have something in place inside the DataTransformer that should handle this, namely at line #403.

metadata = replace(
        dataset.metadata,
        pitch_dimensions=to_pitch_dimensions,
        orientation=to_orientation,
    )

Unfortunately, this replace does not seem to actually replace anything, and the new metadata object still contains the old pitch dimensions.

@probberechts
Copy link
Contributor

probberechts commented Jan 26, 2025

I was trying to wrap my head around why this does not work. If I run your code example, it indeed does not update the metadata. On the other hand, in the tests it seems to work fine. See for example here. Any idea why this fails?

It's not related to tracking vs event data. Same issue when transforming Stats Perform tracking data.

@UnravelSports
Copy link
Contributor Author

UnravelSports commented Jan 26, 2025

The only thing I can think of is that replace only works if @dataclass(frozen=True). Because there is 1 other instance in the code base where we use replace as well, namely also in dataset.py. And here I think it does actually work, and this class if Frozen.

The reason I say this is because I think the replace function is used to replace frozen objects, but I can't remember where I read that.

Why it does work in the tests, I have no idea. I would not be surprised if this is a bug in the dataclass library, simply because it seems to behave irrationally, but I have not gone so far as to check that.

@probberechts
Copy link
Contributor

I've found it! It's the __post_init__ method of the Metadata class that resets the pitch dimensions from the coordinate system. If you want to update the pitch dimensions of a dataset that has a coordinate system, you have to update the coordinate system, not the pitch dimensions. Obviously, this is not very foolproof...

@UnravelSports
Copy link
Contributor Author

Ha, it's always obvious in hindsight! I read something in de the docs the hinted at this last night, but you solved it already before I could say anything 👍

@probberechts
Copy link
Contributor

What result did you actually expect to get from the code above?

  • Update metadata.coordinate_system to a KloppyCoordinateSystem with MetricPitchDimensions1
  • Update metadata.coordinate_system to a generic CoordinateSystem with MetricPitchDimensions
  • Remove the coordinate system from the metadata and only set MetricPitchDimensions in the metadata
  • Raise an exception: "This dataset defines a coordinate system, you must transform the coordinate system instead"

Footnotes

  1. This is not what I would prefer: it would require many changes to the code and allow changing the predefined definition of coordinate systems

@UnravelSports
Copy link
Contributor Author

I think answer 2.

I load event data with the default coordinates, then I transform them to a custom coordinate system with pitch dimensions in meters. Then I expect the dataset.metadata to reflect this change in coordinate system.

probberechts added a commit to probberechts/kloppy that referenced this issue Feb 1, 2025
Fixes PySport#407: Transforming the pitch dimensions of a dataset with a coordinate
system will now change the dataset's coordinate system to
a `CustomCoordinateSystem` with the specified pitch dimensions.

As a side effect, makes it easier to create a custom coordinate system:

```
from kloppy.domain import CustomCoordinateSystem

my_coordinate_system = CustomCoordinateSystem(
	origin=...,
	vertical_orientation=...,
	pitch_dimensions=...,
)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants