-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the precision when converting a decimal128 column to an arrow array #14230
Fix the precision when converting a decimal128 column to an arrow array #14230
Conversation
Uh, should I create this PR against an other branch than the |
The PR should go against |
Thanks @davidwendt. Working on the rebase and conflict. |
e521f25
to
c1d023e
Compare
/ok to test |
/ok to test |
@jihoonson thanks for the PR! Style checks are currently failing. Could you please run our pre-commit hooks locally on your system to fix those? If you don't have those set up, the steps are (I'm making no assumptions about your environment, you may want to change how you install pre-commit if you use conda for instance):
Then commit and push those changes.
That's fine. Since as you say libcudf doesn't have a concept of the precision the only real place it makes sense to include is in the conversion function. I wouldn't sweat the mild duplication.
I agree with your approach, modifying is fine.
At the moment this behavior is not specifically documented. What might be nice is to add a note to the |
68d8ff4
to
ade255f
Compare
/ok to test |
Thanks @hyperbolic2346 and @ttnghia! What will be the next step? Please let me know if there is anything I need to do 🙂 |
/ok to test |
/merge |
Hmm the codecov failed for python/cudf/cudf/io/orc.py, which seems unrelated to my change in this PR. |
/merge |
Description
This PR fixes #13749. As discussed in the issue linked, the precision is unnecessarily being limited to 18 when converting decimal128 to arrow.
Implementation-wise, I wasn't sure where is the best place to define the max precision for decimal types. Given that the decimal types don't store the precision in libcudf, I thought it would be better to not expose the max precision to the outside of to-arrow conversion. However, this led to replicating the definition of max precision across multiple places. Appreciate any suggestion.
Finally, it was suggested in #13749 (comment) to add some tests for round tripping. The existing tests look sufficient to me for round trip tests, so I just modified them instead of adding new tests. Please let me know if we need new tests in addition to them.
I'm also not sure whether any documentation should be fixed. Please let me know.
Checklist