Bug: when passing unstructured array to structured data type, values get broadcast #25

Open
jacanchaplais opened this issue Jun 12, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@jacanchaplais
Owner

If a NumPy array of shape (N, 4) is written to HdfEventWriter.pmu, its columns are not mapped onto the named fields. Instead, every element is broadcast across the x, y, z, e fields, so an (N, 4) array of records gets stored and disk usage increases 4x.

That is, from this:

array([[-2.52392785e+01,  4.17925601e-02,  1.63704453e+01,
         3.00834579e+01],
       [-1.31029150e+01,  2.16167656e-02,  8.49904537e+00,
         1.56179590e+01],
       [ 4.83393040e+02,  4.55137104e+01,  3.82795685e+02,
         6.18282166e+02],
       ...

what gets stored is this:

array([[(-2.52392785e+01, -2.52392785e+01, -2.52392785e+01, -2.52392785e+01),
        ( 4.17925601e-02,  4.17925601e-02,  4.17925601e-02,  4.17925601e-02),
        ( 1.63704453e+01,  1.63704453e+01,  1.63704453e+01,  1.63704453e+01),
        ( 3.00834579e+01,  3.00834579e+01,  3.00834579e+01,  3.00834579e+01)],
       [(-1.31029150e+01, -1.31029150e+01, -1.31029150e+01, -1.31029150e+01),
        ( 2.16167656e-02,  2.16167656e-02,  2.16167656e-02,  2.16167656e-02),
        ( 8.49904537e+00,  8.49904537e+00,  8.49904537e+00,  8.49904537e+00),
        ( 1.56179590e+01,  1.56179590e+01,  1.56179590e+01,  1.56179590e+01)],
       [( 4.83393040e+02,  4.83393040e+02,  4.83393040e+02,  4.83393040e+02),
        ( 4.55137104e+01,  4.55137104e+01,  4.55137104e+01,  4.55137104e+01),
        ( 3.82795685e+02,  3.82795685e+02,  3.82795685e+02,  3.82795685e+02),
        ( 6.18282166e+02,  6.18282166e+02,  6.18282166e+02,  6.18282166e+02)]
        ...

This is clearly a bug. Patch it by converting the (N, 4) array to a structured view whose dtype is compatible with the HDF5 dataset before writing.
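A minimal sketch of that conversion, under some assumptions: the field names x, y, z, e come from the description above, but the float64 field type, the `PMU_DTYPE` name, and the `as_structured_view` helper are hypothetical and not taken from the actual patch. It also reproduces the broadcast by assigning the unstructured array into a structured one, which may or may not be the exact path the writer takes internally.

```python
import numpy as np

# Assumed layout: four float64 fields named as in the issue text.
# The real dataset dtype may differ (e.g. float32).
PMU_DTYPE = np.dtype([("x", "f8"), ("y", "f8"), ("z", "f8"), ("e", "f8")])


def as_structured_view(pmu):
    """Reinterpret an (N, 4) unstructured array as N records (hypothetical helper)."""
    # A view needs contiguous memory whose row size matches the record size.
    flat = np.ascontiguousarray(pmu, dtype=np.float64)
    if flat.ndim != 2 or flat.shape[1] != len(PMU_DTYPE.names):
        raise ValueError("expected an array of shape (N, 4)")
    # Viewing (N, 4) float64 as 32-byte records gives shape (N, 1); flatten to (N,).
    return flat.view(PMU_DTYPE).reshape(-1)


raw = np.array(
    [
        [-2.52392785e01, 4.17925601e-02, 1.63704453e01, 3.00834579e01],
        [-1.31029150e01, 2.16167656e-02, 8.49904537e00, 1.56179590e01],
        [4.83393040e02, 4.55137104e01, 3.82795685e02, 6.18282166e02],
    ]
)

# Reproduce the bug: each scalar is copied into all four fields of a record.
broadcast = np.empty(raw.shape, dtype=PMU_DTYPE)
broadcast[...] = raw

# The fix: one record per row, with columns mapped onto the named fields.
fixed = as_structured_view(raw)
```

Here `broadcast` has shape (3, 4) and takes four times the memory of `fixed`, which has shape (3,). If the input dtype does not match the field type, `numpy.lib.recfunctions.unstructured_to_structured` is an alternative that copies instead of viewing.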

@jacanchaplais jacanchaplais added the bug Something isn't working label Jun 12, 2024
@jacanchaplais jacanchaplais self-assigned this Jun 12, 2024
jacanchaplais added a commit that referenced this issue Jun 12, 2024
Prevent broadcasting when passing unstructured data #25