xrootd read fails for large arrays #1217
Comments
This is using the fsspec-xrootd backend. You can try the old (pre-fsspec) backend by passing

The next thing to try is cutting down on the size of the request by limiting the number of entries with

This is an XRootD error. A different server might even have a different cut-off for what it considers too big of a request.
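As an illustration of limiting the number of entries per request, here is a minimal sketch that reads a tree in fixed-size entry ranges and concatenates the pieces. It assumes the `entry_start` and `entry_stop` arguments of `TTree.arrays` and `ak.concatenate` from Awkward Array; the helper names `entry_ranges` and `read_in_pieces` and the choice of step size are hypothetical and not part of Uproot.

```python
def entry_ranges(num_entries, step):
    """Yield (start, stop) pairs covering [0, num_entries) in pieces of at most `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    for start in range(0, num_entries, step):
        yield start, min(start + step, num_entries)


def read_in_pieces(tree, branches, step):
    """Read `branches` from an uproot TTree in several smaller requests.

    `tree` is expected to behave like an uproot TTree (it needs
    `num_entries` and `arrays(..., entry_start=..., entry_stop=...)`).
    """
    import awkward as ak  # imported here so entry_ranges works without it

    pieces = [
        tree.arrays(branches, entry_start=start, entry_stop=stop)
        for start, stop in entry_ranges(tree.num_entries, step)
    ]
    return ak.concatenate(pieces)
```

A suitable `step` depends on the server, as noted above, so it would probably have to be found by trial and error.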
I'll give that a try (I do know that asking for one branch only stops the error from occurring). Not really sure I agree that it's an xrootd error, though: as a user I am asking for data to materialize in my program, and the details of exactly how the network access is being performed or how big chunk sizes are should not be my concern. I would expect the underlying libraries to manage the requests appropriately; I am not sure how e.g. ServiceX can reliably work if we have to implement the recovery logic on our side.
If this were to be done in Uproot, we'd have to dynamically adjust the request size and resubmit when we see such an error, and have some reasonable give-up policy after a specified number of retries to keep it from spiraling out of control. And if the OSError came from a local file, rather than XRootD, it's not retryable (if an open local file suddenly can't be read, someone must have unplugged the disk or something). Possibly the smallest sections that would need to be retried are:
in uproot5/src/uproot/behaviors/TBranch.py Lines 797 to 835 in 13087b0
in uproot5/src/uproot/behaviors/TBranch.py Lines 1034 to 1072 in 13087b0
in uproot5/src/uproot/behaviors/TBranch.py Lines 1787 to 1831 in 13087b0
Maybe it's possible to retry only the There would also need to be a way to change the granularity of the XRootD request while still requesting all the data the user wants. The Source.chunks method doesn't have a way to express that, but maybe the coalesce algorithm does? Unfortunately, coalesce arguments are specified per-FSSpecSource object, but maybe each retry could be a new, more finely granular |
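The retry idea described in this comment can be sketched independently of Uproot. The following is a hypothetical illustration, not Uproot's implementation: `fetch(start, stop)` stands in for whatever performs a single remote read, `read_with_retries` and `split_range` are invented names, and halving the request size on each failure is just one possible adjustment policy. The caller chooses which exception types count as retryable, so errors from local files can be excluded.

```python
def split_range(start, stop, max_bytes):
    """Split [start, stop) into consecutive sub-ranges of at most max_bytes."""
    return [(a, min(a + max_bytes, stop)) for a in range(start, stop, max_bytes)]


def read_with_retries(fetch, ranges, max_bytes, max_retries=4, retryable=(OSError,)):
    """Fetch every requested byte range, halving the request size after each failure.

    `fetch` is a hypothetical callable: fetch(start, stop) -> bytes.
    The user still gets one bytes object per requested range.
    """
    attempt = 0
    while True:
        try:
            out = []
            for start, stop in ranges:
                parts = [fetch(a, b) for a, b in split_range(start, stop, max_bytes)]
                out.append(b"".join(parts))
            return out
        except retryable:
            attempt += 1
            if attempt > max_retries or max_bytes <= 1:
                raise  # give-up policy: stop after a fixed number of retries
            max_bytes = max(1, max_bytes // 2)


# Stand-in for a server that rejects requests larger than 10 bytes.
data = bytes(range(100))


def fake_fetch(start, stop):
    if stop - start > 10:
        raise OSError("request too large")
    return data[start:stop]


result = read_with_retries(fake_fetch, [(0, 50), (60, 100)], max_bytes=64)
```

This sketch restarts the whole read after a failure; a more careful version would keep the pieces that already succeeded and retry only the failed ones.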
Ah @jpivarski is correct that it does seem that this is an issue with xrootd (related to https://its.cern.ch/jira/browse/ROOT-6639, https://root-forum.cern.ch/t/error-when-streaming-rootfiles-via-xrootd/37783) - it seems dCache servers lie about how big the transfers are that they support, and clients take them at their word for it at their peril. There's a workaround in ROOT, and doing something similar in Python seems to fix the problem. (In the language of the |
Thanks for the update! I've split out the notes on how a retry mechanism could be implemented to #1219, and I'll close this now.
Incidentally, uproot 5.3.8 seems to fix this partially (I believe due to the read coalescing instead of vector reads).
I am running into a problem with arrays() when reading from xrootd:

I've put the file at https://cernbox.cern.ch/s/aLg8sfkoTvgqB9F as well, but obviously this needs to be run over xrootd to reproduce.
Versions: uproot 5.3.7, fsspec 2024.3.1, fsspec_xrootd 0.3.0, xrootd 5.6.9.