Reading structs into higher-dimensional Arrays #9
Indeed, the problem here is that currently there is no support for so-called jagged arrays from multiple leaves in UnROOT. The structure information of the leaves is encoded in the fTitle field:
julia> f["arrays"].fLeaves
UnROOT.TObjArray("", 0, Any[UnROOT.TLeafI
fName: String "nInt"
fTitle: String "nInt"
fLen: Int32 1
fLenType: Int32 4
fOffset: Int32 0
fIsRange: Bool false
fIsUnsigned: Bool false
fLeafCount: UInt32 0x00000000
fMinimum: Int32 0
fMaximum: Int32 0
, UnROOT.TLeafD
fName: String ""
fTitle: String "[6]"
fLen: Int32 6
fLenType: Int32 8
fOffset: Int32 0
fIsRange: Bool false
fIsUnsigned: Bool false
fLeafCount: UInt32 0x00000000
fMinimum: Float64 0.0
fMaximum: Float64 0.0
, UnROOT.TLeafD
fName: String ""
fTitle: String "[2][3]"
fLen: Int32 6
fLenType: Int32 8
fOffset: Int32 0
fIsRange: Bool false
fIsUnsigned: Bool false
fLeafCount: UInt32 0x00000000
fMinimum: Float64 0.0
fMaximum: Float64 0.0
])
I think that parsing this string is the only way to assemble the correct structure, so the first thing is to dynamically create the correct type (Julia
Here is a very basic (and ugly) hack to get the first entry:
julia> @UnROOT.io struct A6DVec
a::Float64
b::Float64
c::Float64
d::Float64
e::Float64
f::Float64
end
julia> UnROOT.unpack(IOBuffer(array(f, "arrays/6dVec"; raw=true)[1]), A6DVec)
A6DVec(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
The
I also looked at the file with
If you want, I can do it for you (create an issue there) and discuss the details, since I guess I first need to understand all the underlying stuff before I can think of an approach to tackle this in the Julia implementation.
Here is a quick example of how it fails in uproot4:
In [1]: import uproot4
In [2]: f = uproot4.open("/Users/tamasgal/Downloads/test_array.root")
In [3]: f["arrays"]["6dVec"].array()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-0ed0e403bdf9> in <module>
----> 1 f["arrays"]["6dVec"].array()
~/.virtualenvs/km3net/lib/python3.7/site-packages/uproot4/behaviors/TBranch.py in array(self, interpretation, entry_start, entry_stop, decompression_executor, interpretation_executor, array_cache, library)
1620 interpretation_executor,
1621 library,
-> 1622 arrays,
1623 )
1624
~/.virtualenvs/km3net/lib/python3.7/site-packages/uproot4/behaviors/TBranch.py in _ranges_or_baskets_to_arrays(hasbranches, ranges_or_baskets, branchid_interpretation, entry_start, entry_stop, decompression_executor, interpretation_executor, library, arrays)
516
517 elif isinstance(obj, tuple) and len(obj) == 3:
--> 518 uproot4.source.futures.delayed_raise(*obj)
519
520 else:
~/.virtualenvs/km3net/lib/python3.7/site-packages/uproot4/source/futures.py in delayed_raise(exception_class, exception_value, traceback)
35 exec("raise exception_class, exception_value, traceback")
36 else:
---> 37 raise exception_value.with_traceback(traceback)
38
39
~/.virtualenvs/km3net/lib/python3.7/site-packages/uproot4/behaviors/TBranch.py in basket_to_array(basket)
482 len(basket_arrays[basket.basket_num]),
483 interpretation,
--> 484 branch.file.file_path,
485 )
486 )
ValueError: basket 0 in tree/branch /arrays;1:6dVec has the wrong number of entries (expected 1, obtained 6) when interpreted as AsDtype('>f8')
in file /Users/tamasgal/Downloads/test_array.root
In [4]: f["arrays"]["6dVec"].show()
name | typename | interpretation
---------------------+----------------------+-----------------------------------
6dVec | double | AsDtype('>f8')
Btw. you can see that both
In [5]: f["arrays"].show()
name | typename | interpretation
---------------------+----------------------+-----------------------------------
nInt | int32_t | AsDtype('>i4')
6dVec | double | AsDtype('>f8')
2x3Mat | double | AsDtype('>f8')
In [6]: f["structs"].show()
name | typename | interpretation
---------------------+----------------------+-----------------------------------
nInt | int32_t | AsDtype('>i4')
2x3mat | double[6] | AsDtype("('>f8', (6,))")
In [7]: f["structs/2x3mat"].array()
Out[7]: <Array [[1, 2, 3, 4, 5, 6]] type='1 * 6 * float64'>
In [9]: import uproot
In [10]: f = uproot.open("/Users/tamasgal/Downloads/test_array.root")
In [11]: f["structs/nInt"].array()
Out[11]: array([1], dtype=int32)
In [12]: f["structs/2x3mat"].array()
Out[12]: array([[1., 2., 3., 4., 5., 6.]])
That's weird, I'm surprised this file is failing in not just one Uproot, but both. It is a type that I've addressed and have tests for. Normally, I'd say that Uproot would be a good model to follow for this case, but apparently not. What it ought to be doing is reading the fixed-size dimensions from the TLeaf fTitle and instructing NumPy to expect
I'm glad it got there leaf-list as a NumPy structured array (which was then converted into Awkward). That's a similar thing: the fields are baked into a dtype. It's a different way of reading than data split among branches. For better debugging, you may want to pass
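The idea described above, reading the fixed-size dimensions from the TLeaf fTitle and using them to group the decoded values per entry, can be sketched roughly as follows. This is only an illustration using the Python standard library, not the code of Uproot or UnROOT: the helper names `parse_leaf_dims` and `decode_entries` are hypothetical, the title is assumed to carry dimensions as bracketed integers such as "[6]" or "[2][3]" (as in the fLeaves output above), and the payload is assumed to be uncompressed big-endian float64 values, matching the '>f8' interpretation shown by uproot4.

```python
import math
import re
import struct


def parse_leaf_dims(title):
    """Extract fixed-size dimensions from a leaf title, e.g. "[2][3]" -> (2, 3).

    A title without bracketed integers (a scalar leaf) yields an empty tuple.
    """
    return tuple(int(n) for n in re.findall(r"\[(\d+)\]", title))


def decode_entries(raw, dims):
    """Decode big-endian float64 bytes and group them into one flat list per entry."""
    per_entry = math.prod(dims)  # items per entry; 1 when dims is empty
    n_items = len(raw) // 8      # 8 bytes per float64
    if n_items % per_entry != 0:
        raise ValueError("payload does not divide evenly into entries")
    values = struct.unpack(f">{n_items}d", raw)
    return [list(values[i:i + per_entry]) for i in range(0, n_items, per_entry)]


# Hypothetical payload: one entry holding the six values 1.0 .. 6.0.
raw = struct.pack(">6d", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

with_dims = decode_entries(raw, parse_leaf_dims("[2][3]"))
without_dims = decode_entries(raw, ())

print(len(with_dims), len(without_dims))
```

With the dimensions taken into account, the six values form a single entry. Ignoring them treats each value as its own entry, which is the same kind of mismatch as the "expected 1, obtained 6" error in the uproot4 traceback.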
I just caught up to your last example, that
Oh dear, you are fast 😅 Alright, thanks! I will dig deeper... I just created an issue in
Yes, so the same data is saved in
I just looked at a bit, sadly no. Your branch is extra jagged by the fact that each element is higher-dimensional, and they are certainly some kind of nested vector in C++? I've never seen a matrix branch in my (LHC experiment) life, sorry, but I will be happy to learn from Jim later on!
Okay, seems reasonable for the
However the
PS: In experiments there is probably no reason to use higher-dimensional arrays. But for me as a theoretician, I am storing a lot of matrix-valued observables from lattice simulations, and reading these results back into Julia would be really nice 😊
ROOT offers different ways to store arrays or even structs in TBranches. I added a small ROOT file (test_array.root.zip) which contains a TTree called 'arrays', which contains a scalar, a 6d-vector and a 2x3-matrix, and another TTree called 'structs', which contains the same scalar as 'arrays' and a struct which again resembles a 2x3-matrix but is stored as a 6d-vector and is defined in the following way:
struct m23 {double el[6];};
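To make the relation between the flat el[6] storage and a 2x3 matrix concrete, here is a minimal sketch. It assumes the six values are meant in row-major order, as a C++ double[2][3] would lay them out in memory; the issue does not state the intended ordering for the struct, so treat that as an assumption. The helper name `reshape_row_major` is hypothetical.

```python
def reshape_row_major(flat, rows, cols):
    """View a flat sequence as a rows x cols nested list, filling row by row."""
    if len(flat) != rows * cols:
        raise ValueError("flat length does not match rows * cols")
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# The six values stored in the test file, as shown by the uproot output.
el = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
m = reshape_row_major(el, 2, 3)

# Element (i, j) of the matrix is el[i * 3 + j] under the row-major assumption.
print(m)
```

Julia arrays are column-major, so a plain reshape(el, 2, 3) there would fill columns first. Under the row-major assumption, permutedims(reshape(el, 3, 2)) gives the same 2x3 layout as above.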
So the file was created in the following way (may be useful when trying to retrieve data in Julia):
In the ROOT command line the TTrees look like this in the end:
When I try to read this in Julia using the UnROOT package I get the following:
It seems that, while array refuses to handle multidimensional objects in a TBranch, the function DataFrame is able to get the scalar and the first entry of each multidimensional object. I tested this with another file, and DataFrame correctly returns all rows of scalar entries and, next to each, the first element of the higher-dimensional object. So it looks like this:
I would like to request extending the support for multidimensional objects, at least for the array function.