Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Column to JCUDF row for tables with strings #10235

Merged

Conversation

hyperbolic2346
Copy link
Contributor

@hyperbolic2346 hyperbolic2346 commented Feb 7, 2022

This is the code for the column to row portion of the string work. This code will convert a table that includes strings into the JCUDF row format. This depends on #10157 and as such, is a draft PR until that is merged. I am putting this up now so people working on reviewing that PR can see where it is headed.

closes #10234

@hyperbolic2346 hyperbolic2346 added feature request New feature or request 2 - In Progress Currently a work in progress Java Affects Java cuDF API. 5 - DO NOT MERGE Hold off on merging; see PR for details labels Feb 7, 2022
@codecov
Copy link

codecov bot commented Feb 7, 2022

Codecov Report

Merging #10235 (c28b107) into branch-22.04 (4596244) will increase coverage by 0.02%.
The diff coverage is 96.86%.

❗ Current head c28b107 differs from pull request most recent head 8455fa3. Consider uploading reports for the commit 8455fa3 to get more accurate results

@@               Coverage Diff                @@
##           branch-22.04   #10235      +/-   ##
================================================
+ Coverage         86.13%   86.16%   +0.02%     
================================================
  Files               139      139              
  Lines             22438    22447       +9     
================================================
+ Hits              19328    19341      +13     
+ Misses             3110     3106       -4     
Impacted Files Coverage Δ
python/dask_cudf/dask_cudf/tests/test_accessor.py 98.41% <ø> (ø)
python/cudf/cudf/core/_base_index.py 85.92% <50.00%> (-0.51%) ⬇️
python/cudf/cudf/core/column/decimal.py 91.30% <73.68%> (-1.01%) ⬇️
python/cudf/cudf/core/column/categorical.py 89.63% <84.61%> (-0.29%) ⬇️
python/cudf/cudf/core/column/string.py 88.91% <94.44%> (+0.64%) ⬆️
python/cudf/cudf/core/column/numerical.py 95.62% <95.83%> (+0.64%) ⬆️
python/cudf/cudf/core/frame.py 91.84% <96.42%> (+0.12%) ⬆️
python/cudf/cudf/_typing.py 94.11% <100.00%> (+0.78%) ⬆️
python/cudf/cudf/api/types.py 89.79% <100.00%> (ø)
python/cudf/cudf/core/column/column.py 89.27% <100.00%> (+0.10%) ⬆️
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 22a9f35...8455fa3. Read the comment docs.

@hyperbolic2346 hyperbolic2346 force-pushed the mwilson/column-to-row-strings branch from f06c67a to 204b606 Compare February 11, 2022 05:01
@hyperbolic2346 hyperbolic2346 added 3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond non-breaking Non-breaking change and removed 2 - In Progress Currently a work in progress 5 - DO NOT MERGE Hold off on merging; see PR for details labels Feb 11, 2022
@hyperbolic2346 hyperbolic2346 marked this pull request as ready for review February 11, 2022 05:18
@hyperbolic2346 hyperbolic2346 requested a review from a team as a code owner February 11, 2022 05:18
@hyperbolic2346 hyperbolic2346 self-assigned this Feb 11, 2022
Copy link
Contributor

@mythrocks mythrocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. A couple of minor nitpicks. Apologies for "the fog".

@hyperbolic2346 hyperbolic2346 force-pushed the mwilson/column-to-row-strings branch from 413e5de to c3c5c67 Compare March 14, 2022 20:51
Copy link
Contributor

@mythrocks mythrocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

java/src/main/native/src/row_conversion.cu Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Outdated Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Outdated Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Outdated Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Outdated Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Outdated Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Show resolved Hide resolved
java/src/main/native/src/row_conversion.cu Outdated Show resolved Hide resolved
std::vector<int8_t const *> variable_width_input_data(
variable_data_begin, variable_data_begin + variable_width_table.num_columns());

auto dev_variable_input_data = make_device_uvector_async(variable_width_input_data, stream);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was surprised to find we don't have an iterator version of make_device_uvector_async()

@hyperbolic2346
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 26c22b6 into rapidsai:branch-22.04 Mar 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team 4 - Needs Review Waiting for reviewer to review or respond feature request New feature or request Java Affects Java cuDF API. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Column to JCUDF row for strings
4 participants