diff --git a/CHANGELOG.md b/CHANGELOG.md index bce764f59e3..3cb6caa25ee 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,228 +1,3 @@ -# cuDF 24.02.00 (12 Feb 2024) - -## 🚨 Breaking Changes - -- Remove **kwargs from astype ([#14765](https://github.com/rapidsai/cudf/pull/14765)) [@mroeschke](https://github.com/mroeschke) -- Remove mimesis as a testing dependency ([#14723](https://github.com/rapidsai/cudf/pull/14723)) [@mroeschke](https://github.com/mroeschke) -- Update to Dask's `shuffle_method` kwarg ([#14708](https://github.com/rapidsai/cudf/pull/14708)) [@pentschev](https://github.com/pentschev) -- Drop Pascal GPU support. ([#14630](https://github.com/rapidsai/cudf/pull/14630)) [@bdice](https://github.com/bdice) -- Update to CCCL 2.2.0. ([#14576](https://github.com/rapidsai/cudf/pull/14576)) [@bdice](https://github.com/bdice) -- Expunge as_frame conversions in Column algorithms ([#14491](https://github.com/rapidsai/cudf/pull/14491)) [@wence-](https://github.com/wence-) -- Deprecate cudf::make_strings_column accepting typed offsets ([#14461](https://github.com/rapidsai/cudf/pull/14461)) [@davidwendt](https://github.com/davidwendt) -- Remove deprecated nvtext::load_merge_pairs_file ([#14460](https://github.com/rapidsai/cudf/pull/14460)) [@davidwendt](https://github.com/davidwendt) -- Include writer code and writerVersion in ORC files ([#14458](https://github.com/rapidsai/cudf/pull/14458)) [@vuule](https://github.com/vuule) -- Remove null mask for zero nulls in json readers ([#14451](https://github.com/rapidsai/cudf/pull/14451)) [@karthikeyann](https://github.com/karthikeyann) -- REF: Remove **kwargs from to_pandas, raise if nullable is not implemented ([#14438](https://github.com/rapidsai/cudf/pull/14438)) [@mroeschke](https://github.com/mroeschke) -- Consolidate 1D pandas object handling in as_column ([#14394](https://github.com/rapidsai/cudf/pull/14394)) [@mroeschke](https://github.com/mroeschke) -- Move chars column to parent data buffer in strings column ([#14202](https://github.com/rapidsai/cudf/pull/14202)) [@karthikeyann](https://github.com/karthikeyann) -- Switch to scikit-build-core ([#13531](https://github.com/rapidsai/cudf/pull/13531)) [@vyasr](https://github.com/vyasr) - -## 🐛 Bug Fixes - -- Exclude tests from builds ([#14981](https://github.com/rapidsai/cudf/pull/14981)) [@vyasr](https://github.com/vyasr) -- Fix the bounce buffer size in ORC writer ([#14947](https://github.com/rapidsai/cudf/pull/14947)) [@vuule](https://github.com/vuule) -- Revert sum/product aggregation to always produce `int64_t` type ([#14907](https://github.com/rapidsai/cudf/pull/14907)) [@SurajAralihalli](https://github.com/SurajAralihalli) -- Fixed an issue with output chunking computation stemming from input chunking. ([#14889](https://github.com/rapidsai/cudf/pull/14889)) [@nvdbaranec](https://github.com/nvdbaranec) -- Fix total_byte_size in Parquet row group metadata ([#14802](https://github.com/rapidsai/cudf/pull/14802)) [@etseidl](https://github.com/etseidl) -- Fix index difference to follow the pandas format ([#14789](https://github.com/rapidsai/cudf/pull/14789)) [@amiralimi](https://github.com/amiralimi) -- Fix shared-workflows repo name ([#14784](https://github.com/rapidsai/cudf/pull/14784)) [@raydouglass](https://github.com/raydouglass) -- Remove unparseable attributes from all nodes ([#14780](https://github.com/rapidsai/cudf/pull/14780)) [@vyasr](https://github.com/vyasr) -- Refactor and add validation to IntervalIndex.__init__ ([#14778](https://github.com/rapidsai/cudf/pull/14778)) [@mroeschke](https://github.com/mroeschke) -- Work around incompatibilities between V2 page header handling and zStandard compression in Parquet writer ([#14772](https://github.com/rapidsai/cudf/pull/14772)) [@etseidl](https://github.com/etseidl) -- Fix calls to deprecated strings factory API ([#14771](https://github.com/rapidsai/cudf/pull/14771)) [@davidwendt](https://github.com/davidwendt) -- Fix ptx file discovery in editable installs ([#14767](https://github.com/rapidsai/cudf/pull/14767)) [@vyasr](https://github.com/vyasr) -- Revise ``shuffle`` deprecation to align with dask/dask ([#14762](https://github.com/rapidsai/cudf/pull/14762)) [@rjzamora](https://github.com/rjzamora) -- Enable intermediate proxies to be picklable ([#14752](https://github.com/rapidsai/cudf/pull/14752)) [@shwina](https://github.com/shwina) -- Add CUDF_TEST_PROGRAM_MAIN macro to tests lacking it ([#14751](https://github.com/rapidsai/cudf/pull/14751)) [@etseidl](https://github.com/etseidl) -- Fix CMake args ([#14746](https://github.com/rapidsai/cudf/pull/14746)) [@vyasr](https://github.com/vyasr) -- Fix logic bug introduced in #14730 ([#14742](https://github.com/rapidsai/cudf/pull/14742)) [@wence-](https://github.com/wence-) -- [Java] Choose The Correct RoundingMode For Checking Decimal OutOfBounds ([#14731](https://github.com/rapidsai/cudf/pull/14731)) [@razajafri](https://github.com/razajafri) -- Fix ``Groupby.get_group`` ([#14728](https://github.com/rapidsai/cudf/pull/14728)) [@rjzamora](https://github.com/rjzamora) -- Ensure that all CUDA kernels in cudf have hidden visibility. ([#14726](https://github.com/rapidsai/cudf/pull/14726)) [@robertmaynard](https://github.com/robertmaynard) -- Split cuda versions for notebook testing ([#14722](https://github.com/rapidsai/cudf/pull/14722)) [@raydouglass](https://github.com/raydouglass) -- Fix to_numeric not preserving Series index and name ([#14718](https://github.com/rapidsai/cudf/pull/14718)) [@mroeschke](https://github.com/mroeschke) -- Update dask-cudf wheel name ([#14713](https://github.com/rapidsai/cudf/pull/14713)) [@raydouglass](https://github.com/raydouglass) -- Fix strings::contains matching end of string target ([#14711](https://github.com/rapidsai/cudf/pull/14711)) [@davidwendt](https://github.com/davidwendt) -- Update to Dask's `shuffle_method` kwarg ([#14708](https://github.com/rapidsai/cudf/pull/14708)) [@pentschev](https://github.com/pentschev) -- Write file-level statistics when writing ORC files with zero rows ([#14707](https://github.com/rapidsai/cudf/pull/14707)) [@vuule](https://github.com/vuule) -- Potential fix for peformance regression in #14415 ([#14706](https://github.com/rapidsai/cudf/pull/14706)) [@etseidl](https://github.com/etseidl) -- Ensure DataFrame column types are preserved during serialization ([#14705](https://github.com/rapidsai/cudf/pull/14705)) [@mroeschke](https://github.com/mroeschke) -- Skip numba test that fails on ARM ([#14702](https://github.com/rapidsai/cudf/pull/14702)) [@brandon-b-miller](https://github.com/brandon-b-miller) -- Allow Z in datetime string parsing in non pandas compat mode ([#14701](https://github.com/rapidsai/cudf/pull/14701)) [@mroeschke](https://github.com/mroeschke) -- Fix nan_as_null not being respected when passing arrow object ([#14688](https://github.com/rapidsai/cudf/pull/14688)) [@mroeschke](https://github.com/mroeschke) -- Fix constructing Series/Index from arrow array and dtype ([#14686](https://github.com/rapidsai/cudf/pull/14686)) [@mroeschke](https://github.com/mroeschke) -- Fix Aggregation Type Promotion: Ensure Unsigned Input Types Result in Unsigned Output for Sum and Multiply ([#14679](https://github.com/rapidsai/cudf/pull/14679)) [@SurajAralihalli](https://github.com/SurajAralihalli) -- Add BaseOffset as a final proxy type to pass instancechecks for offsets against `BaseOffset` ([#14678](https://github.com/rapidsai/cudf/pull/14678)) [@shwina](https://github.com/shwina) -- Add row conversion code from spark-rapids-jni ([#14664](https://github.com/rapidsai/cudf/pull/14664)) [@ttnghia](https://github.com/ttnghia) -- Unconditionally export the CCCL path ([#14656](https://github.com/rapidsai/cudf/pull/14656)) [@vyasr](https://github.com/vyasr) -- Ensure libcudf searches for our patched version of CCCL first ([#14655](https://github.com/rapidsai/cudf/pull/14655)) [@robertmaynard](https://github.com/robertmaynard) -- Constrain CUDA in notebook testing to prevent CUDA 12.1 usage until we have pynvjitlink ([#14648](https://github.com/rapidsai/cudf/pull/14648)) [@vyasr](https://github.com/vyasr) -- Fix invalid memory access in Parquet reader ([#14637](https://github.com/rapidsai/cudf/pull/14637)) [@etseidl](https://github.com/etseidl) -- Use column_empty over as_column([]) ([#14632](https://github.com/rapidsai/cudf/pull/14632)) [@mroeschke](https://github.com/mroeschke) -- Add (implicit) handling for torch tensors in is_scalar ([#14623](https://github.com/rapidsai/cudf/pull/14623)) [@wence-](https://github.com/wence-) -- Fix astype/fillna not maintaining column subclass and types ([#14615](https://github.com/rapidsai/cudf/pull/14615)) [@mroeschke](https://github.com/mroeschke) -- Remove non-empty nulls in cudf::get_json_object ([#14609](https://github.com/rapidsai/cudf/pull/14609)) [@davidwendt](https://github.com/davidwendt) -- Remove `cuda::proclaim_return_type` from nested lambda ([#14607](https://github.com/rapidsai/cudf/pull/14607)) [@ttnghia](https://github.com/ttnghia) -- Fix DataFrame.reindex when column reindexing to MultiIndex/RangeIndex ([#14605](https://github.com/rapidsai/cudf/pull/14605)) [@mroeschke](https://github.com/mroeschke) -- Address potential race conditions in Parquet reader ([#14602](https://github.com/rapidsai/cudf/pull/14602)) [@etseidl](https://github.com/etseidl) -- Fix DataFrame.reindex removing column name ([#14601](https://github.com/rapidsai/cudf/pull/14601)) [@mroeschke](https://github.com/mroeschke) -- Remove unsanitized input test data from copy gtests ([#14600](https://github.com/rapidsai/cudf/pull/14600)) [@davidwendt](https://github.com/davidwendt) -- Fix race detected in Parquet writer ([#14598](https://github.com/rapidsai/cudf/pull/14598)) [@etseidl](https://github.com/etseidl) -- Correct invalid or missing return types ([#14587](https://github.com/rapidsai/cudf/pull/14587)) [@robertmaynard](https://github.com/robertmaynard) -- Fix unsanitized nulls from strings segmented-reduce ([#14586](https://github.com/rapidsai/cudf/pull/14586)) [@davidwendt](https://github.com/davidwendt) -- Upgrade to nvCOMP 3.0.5 ([#14581](https://github.com/rapidsai/cudf/pull/14581)) [@davidwendt](https://github.com/davidwendt) -- Fix unsanitized nulls produced by `cudf::clamp` APIs ([#14580](https://github.com/rapidsai/cudf/pull/14580)) [@davidwendt](https://github.com/davidwendt) -- Fix unsanitized nulls produced by libcudf dictionary decode ([#14578](https://github.com/rapidsai/cudf/pull/14578)) [@davidwendt](https://github.com/davidwendt) -- Fixes a symbol group lookup table issue ([#14561](https://github.com/rapidsai/cudf/pull/14561)) [@elstehle](https://github.com/elstehle) -- Drop llvm16 from cuda118-conda devcontainer image ([#14526](https://github.com/rapidsai/cudf/pull/14526)) [@charlesbluca](https://github.com/charlesbluca) -- REF: Make DataFrame.from_pandas process by column ([#14483](https://github.com/rapidsai/cudf/pull/14483)) [@mroeschke](https://github.com/mroeschke) -- Improve memory footprint of isin by using contains ([#14478](https://github.com/rapidsai/cudf/pull/14478)) [@wence-](https://github.com/wence-) -- Move creation of env.yaml outside the current directory ([#14476](https://github.com/rapidsai/cudf/pull/14476)) [@davidwendt](https://github.com/davidwendt) -- Enable `pd.Timestamp` objects to be picklable when `cudf.pandas` is active ([#14474](https://github.com/rapidsai/cudf/pull/14474)) [@shwina](https://github.com/shwina) -- Correct dtype of count aggregations on empty dataframes ([#14473](https://github.com/rapidsai/cudf/pull/14473)) [@wence-](https://github.com/wence-) -- Avoid DataFrame conversion in `MultiIndex.from_pandas` ([#14470](https://github.com/rapidsai/cudf/pull/14470)) [@mroeschke](https://github.com/mroeschke) -- JSON writer: avoid default stream use in `string_scalar` constructors ([#14444](https://github.com/rapidsai/cudf/pull/14444)) [@vuule](https://github.com/vuule) -- Fix default stream use in the CSV reader ([#14443](https://github.com/rapidsai/cudf/pull/14443)) [@vuule](https://github.com/vuule) -- Preserve DataFrame(columns=).columns dtype during empty-like construction ([#14381](https://github.com/rapidsai/cudf/pull/14381)) [@mroeschke](https://github.com/mroeschke) -- Defer PTX file load to runtime ([#13690](https://github.com/rapidsai/cudf/pull/13690)) [@brandon-b-miller](https://github.com/brandon-b-miller) - -## 📖 Documentation - -- Disable parallel build ([#14796](https://github.com/rapidsai/cudf/pull/14796)) [@vyasr](https://github.com/vyasr) -- Add pylibcudf to the docs ([#14791](https://github.com/rapidsai/cudf/pull/14791)) [@vyasr](https://github.com/vyasr) -- Describe unpickling expectations when cudf.pandas is enabled ([#14693](https://github.com/rapidsai/cudf/pull/14693)) [@shwina](https://github.com/shwina) -- Update CONTRIBUTING for pyproject-only builds ([#14653](https://github.com/rapidsai/cudf/pull/14653)) [@vyasr](https://github.com/vyasr) -- More doxygen fixes ([#14639](https://github.com/rapidsai/cudf/pull/14639)) [@vyasr](https://github.com/vyasr) -- Enable doxygen XML generation and fix issues ([#14477](https://github.com/rapidsai/cudf/pull/14477)) [@vyasr](https://github.com/vyasr) -- Some doxygen improvements ([#14469](https://github.com/rapidsai/cudf/pull/14469)) [@vyasr](https://github.com/vyasr) -- Remove warning in dask-cudf docs ([#14454](https://github.com/rapidsai/cudf/pull/14454)) [@wence-](https://github.com/wence-) -- Update README links with redirects. ([#14378](https://github.com/rapidsai/cudf/pull/14378)) [@bdice](https://github.com/bdice) -- Add pip install instructions to README ([#13677](https://github.com/rapidsai/cudf/pull/13677)) [@shwina](https://github.com/shwina) - -## 🚀 New Features - -- Add ci check for external kernels ([#14768](https://github.com/rapidsai/cudf/pull/14768)) [@robertmaynard](https://github.com/robertmaynard) -- JSON single quote normalization API ([#14729](https://github.com/rapidsai/cudf/pull/14729)) [@shrshi](https://github.com/shrshi) -- Write cuDF version in Parquet "created_by" metadata field ([#14721](https://github.com/rapidsai/cudf/pull/14721)) [@etseidl](https://github.com/etseidl) -- Implement remaining copying APIs in pylibcudf along with required helper functions ([#14640](https://github.com/rapidsai/cudf/pull/14640)) [@vyasr](https://github.com/vyasr) -- Don't constrain `numba<0.58` ([#14616](https://github.com/rapidsai/cudf/pull/14616)) [@brandon-b-miller](https://github.com/brandon-b-miller) -- Add DELTA_LENGTH_BYTE_ARRAY encoder and decoder for Parquet ([#14590](https://github.com/rapidsai/cudf/pull/14590)) [@etseidl](https://github.com/etseidl) -- JSON - Parse mixed types as string in JSON reader ([#14572](https://github.com/rapidsai/cudf/pull/14572)) [@karthikeyann](https://github.com/karthikeyann) -- JSON quote normalization ([#14545](https://github.com/rapidsai/cudf/pull/14545)) [@shrshi](https://github.com/shrshi) -- Make DefaultHostMemoryAllocator settable ([#14523](https://github.com/rapidsai/cudf/pull/14523)) [@gerashegalov](https://github.com/gerashegalov) -- Implement more copying APIs in pylibcudf ([#14508](https://github.com/rapidsai/cudf/pull/14508)) [@vyasr](https://github.com/vyasr) -- Include writer code and writerVersion in ORC files ([#14458](https://github.com/rapidsai/cudf/pull/14458)) [@vuule](https://github.com/vuule) -- Parquet sub-rowgroup reading. ([#14360](https://github.com/rapidsai/cudf/pull/14360)) [@nvdbaranec](https://github.com/nvdbaranec) -- Move chars column to parent data buffer in strings column ([#14202](https://github.com/rapidsai/cudf/pull/14202)) [@karthikeyann](https://github.com/karthikeyann) -- PARQUET-2261 Size Statistics ([#14000](https://github.com/rapidsai/cudf/pull/14000)) [@etseidl](https://github.com/etseidl) -- Improve GroupBy JIT error handling ([#13854](https://github.com/rapidsai/cudf/pull/13854)) [@brandon-b-miller](https://github.com/brandon-b-miller) -- Generate unified Python/C++ docs ([#13846](https://github.com/rapidsai/cudf/pull/13846)) [@vyasr](https://github.com/vyasr) -- Expand JIT groupby test suite ([#13813](https://github.com/rapidsai/cudf/pull/13813)) [@brandon-b-miller](https://github.com/brandon-b-miller) - -## 🛠️ Improvements - -- Pin `pytest<8` ([#14920](https://github.com/rapidsai/cudf/pull/14920)) [@galipremsagar](https://github.com/galipremsagar) -- Move cudf::char_utf8 definition from detail to public header ([#14779](https://github.com/rapidsai/cudf/pull/14779)) [@davidwendt](https://github.com/davidwendt) -- Clean up `TimedeltaIndex.__init__` constructor ([#14775](https://github.com/rapidsai/cudf/pull/14775)) [@mroeschke](https://github.com/mroeschke) -- Clean up `DatetimeIndex.__init__` constructor ([#14774](https://github.com/rapidsai/cudf/pull/14774)) [@mroeschke](https://github.com/mroeschke) -- Some `frame.py` typing, move seldom used methods in `frame.py` ([#14766](https://github.com/rapidsai/cudf/pull/14766)) [@mroeschke](https://github.com/mroeschke) -- Remove **kwargs from astype ([#14765](https://github.com/rapidsai/cudf/pull/14765)) [@mroeschke](https://github.com/mroeschke) -- fix benchmarks compatibility with newer pytest-cases ([#14764](https://github.com/rapidsai/cudf/pull/14764)) [@jameslamb](https://github.com/jameslamb) -- Add `pynvjitlink` as a dependency ([#14763](https://github.com/rapidsai/cudf/pull/14763)) [@brandon-b-miller](https://github.com/brandon-b-miller) -- Resolve degenerate performance in `create_structs_data` ([#14761](https://github.com/rapidsai/cudf/pull/14761)) [@SurajAralihalli](https://github.com/SurajAralihalli) -- Simplify ColumnAccessor methods; avoid unnecessary validations ([#14758](https://github.com/rapidsai/cudf/pull/14758)) [@mroeschke](https://github.com/mroeschke) -- Pin pytest-cases<3.8.2 ([#14756](https://github.com/rapidsai/cudf/pull/14756)) [@mroeschke](https://github.com/mroeschke) -- Use _from_data instead of _from_columns for initialzing Frame ([#14755](https://github.com/rapidsai/cudf/pull/14755)) [@mroeschke](https://github.com/mroeschke) -- Consolidate cudf object handling in as_column ([#14754](https://github.com/rapidsai/cudf/pull/14754)) [@mroeschke](https://github.com/mroeschke) -- Reduce execution time of Parquet C++ tests ([#14750](https://github.com/rapidsai/cudf/pull/14750)) [@vuule](https://github.com/vuule) -- Implement to_datetime(..., utc=True) ([#14749](https://github.com/rapidsai/cudf/pull/14749)) [@mroeschke](https://github.com/mroeschke) -- Remove usages of rapids-env-update ([#14748](https://github.com/rapidsai/cudf/pull/14748)) [@KyleFromNVIDIA](https://github.com/KyleFromNVIDIA) -- Provide explicit pool size and avoid RMM detail APIs ([#14741](https://github.com/rapidsai/cudf/pull/14741)) [@harrism](https://github.com/harrism) -- Implement `cudf.MultiIndex.from_arrays` ([#14740](https://github.com/rapidsai/cudf/pull/14740)) [@mroeschke](https://github.com/mroeschke) -- Remove unused/single use methods ([#14739](https://github.com/rapidsai/cudf/pull/14739)) [@mroeschke](https://github.com/mroeschke) -- refactor CUDA versions in dependencies.yaml ([#14733](https://github.com/rapidsai/cudf/pull/14733)) [@jameslamb](https://github.com/jameslamb) -- Remove unneeded methods in Column ([#14730](https://github.com/rapidsai/cudf/pull/14730)) [@mroeschke](https://github.com/mroeschke) -- Clean up base column methods ([#14725](https://github.com/rapidsai/cudf/pull/14725)) [@mroeschke](https://github.com/mroeschke) -- Ensure column.fillna signatures are consistent ([#14724](https://github.com/rapidsai/cudf/pull/14724)) [@mroeschke](https://github.com/mroeschke) -- Remove mimesis as a testing dependency ([#14723](https://github.com/rapidsai/cudf/pull/14723)) [@mroeschke](https://github.com/mroeschke) -- Replace as_numerical with as_numerical_column/codes ([#14719](https://github.com/rapidsai/cudf/pull/14719)) [@mroeschke](https://github.com/mroeschke) -- Use offsetalator in gather_chars ([#14700](https://github.com/rapidsai/cudf/pull/14700)) [@davidwendt](https://github.com/davidwendt) -- Use make_strings_children for fill() specialization logic ([#14697](https://github.com/rapidsai/cudf/pull/14697)) [@davidwendt](https://github.com/davidwendt) -- Change `io::detail::orc` namespace into `io::orc::detail` ([#14696](https://github.com/rapidsai/cudf/pull/14696)) [@ttnghia](https://github.com/ttnghia) -- Fix call to deprecated factory function ([#14695](https://github.com/rapidsai/cudf/pull/14695)) [@davidwendt](https://github.com/davidwendt) -- Use as_column instead of arange for range like inputs ([#14689](https://github.com/rapidsai/cudf/pull/14689)) [@mroeschke](https://github.com/mroeschke) -- Reorganize ORC reader into multiple files and perform some small fixes to cuIO code ([#14665](https://github.com/rapidsai/cudf/pull/14665)) [@ttnghia](https://github.com/ttnghia) -- Split parquet test into multiple files ([#14663](https://github.com/rapidsai/cudf/pull/14663)) [@etseidl](https://github.com/etseidl) -- Custom error messages for IO with nonexistent files ([#14662](https://github.com/rapidsai/cudf/pull/14662)) [@vuule](https://github.com/vuule) -- Explicitly pass .dtype into is_foo_dtype functions ([#14657](https://github.com/rapidsai/cudf/pull/14657)) [@mroeschke](https://github.com/mroeschke) -- Basic validation in reader benchmarks ([#14647](https://github.com/rapidsai/cudf/pull/14647)) [@vuule](https://github.com/vuule) -- Update dependencies.yaml to support CUDA 12.*. ([#14644](https://github.com/rapidsai/cudf/pull/14644)) [@bdice](https://github.com/bdice) -- Consolidate memoryview handling in as_column ([#14643](https://github.com/rapidsai/cudf/pull/14643)) [@mroeschke](https://github.com/mroeschke) -- Convert `FieldType` to scoped enum ([#14642](https://github.com/rapidsai/cudf/pull/14642)) [@vuule](https://github.com/vuule) -- Use instance over is_foo_dtype ([#14641](https://github.com/rapidsai/cudf/pull/14641)) [@mroeschke](https://github.com/mroeschke) -- Use isinstance over is_foo_dtype internally ([#14638](https://github.com/rapidsai/cudf/pull/14638)) [@mroeschke](https://github.com/mroeschke) -- Remove unnecessary **kwargs in function signatures ([#14635](https://github.com/rapidsai/cudf/pull/14635)) [@mroeschke](https://github.com/mroeschke) -- Drop nvbench patch for nvml. ([#14631](https://github.com/rapidsai/cudf/pull/14631)) [@bdice](https://github.com/bdice) -- Drop Pascal GPU support. ([#14630](https://github.com/rapidsai/cudf/pull/14630)) [@bdice](https://github.com/bdice) -- Add cpp/doxygen/xml to .gitignore ([#14613](https://github.com/rapidsai/cudf/pull/14613)) [@davidwendt](https://github.com/davidwendt) -- Create strings-specific make_offsets_child_column for multiple offset types ([#14612](https://github.com/rapidsai/cudf/pull/14612)) [@davidwendt](https://github.com/davidwendt) -- Use the offsetalator in cudf::concatenate for strings ([#14611](https://github.com/rapidsai/cudf/pull/14611)) [@davidwendt](https://github.com/davidwendt) -- Make Parquet ColumnIndex null_counts optional ([#14596](https://github.com/rapidsai/cudf/pull/14596)) [@etseidl](https://github.com/etseidl) -- Support `freq` in DatetimeIndex ([#14593](https://github.com/rapidsai/cudf/pull/14593)) [@shwina](https://github.com/shwina) -- Remove legacy benchmarks for cuDF-python ([#14591](https://github.com/rapidsai/cudf/pull/14591)) [@osidekyle](https://github.com/osidekyle) -- Remove WORKSPACE env var from cudf_test temp_directory class ([#14588](https://github.com/rapidsai/cudf/pull/14588)) [@davidwendt](https://github.com/davidwendt) -- Use exceptions instead of return values to handle errors in `CompactProtocolReader` ([#14582](https://github.com/rapidsai/cudf/pull/14582)) [@vuule](https://github.com/vuule) -- Use cuda::proclaim_return_type on device lambdas. ([#14577](https://github.com/rapidsai/cudf/pull/14577)) [@bdice](https://github.com/bdice) -- Update to CCCL 2.2.0. ([#14576](https://github.com/rapidsai/cudf/pull/14576)) [@bdice](https://github.com/bdice) -- Update dependencies.yaml to new pip index ([#14575](https://github.com/rapidsai/cudf/pull/14575)) [@vyasr](https://github.com/vyasr) -- Simplify Python CMake ([#14565](https://github.com/rapidsai/cudf/pull/14565)) [@vyasr](https://github.com/vyasr) -- Java expose parquet pass_read_limit ([#14564](https://github.com/rapidsai/cudf/pull/14564)) [@revans2](https://github.com/revans2) -- Add column sanitization checks in `CUDF_TEST_EXPECT_COLUMN_*` macros ([#14559](https://github.com/rapidsai/cudf/pull/14559)) [@SurajAralihalli](https://github.com/SurajAralihalli) -- Use cudf_test temp_directory class for nvtext::subword_tokenize gbenchmark ([#14558](https://github.com/rapidsai/cudf/pull/14558)) [@davidwendt](https://github.com/davidwendt) -- Fix return type of prefix increment overloads ([#14544](https://github.com/rapidsai/cudf/pull/14544)) [@vuule](https://github.com/vuule) -- Make bpe_merge_pairs_impl member private ([#14543](https://github.com/rapidsai/cudf/pull/14543)) [@davidwendt](https://github.com/davidwendt) -- Small clean up in `io::statistics` ([#14542](https://github.com/rapidsai/cudf/pull/14542)) [@vuule](https://github.com/vuule) -- Change json gtest environment variable to compile-time definition ([#14541](https://github.com/rapidsai/cudf/pull/14541)) [@davidwendt](https://github.com/davidwendt) -- Remove extra total chars size calculation from cudf::concatenate ([#14540](https://github.com/rapidsai/cudf/pull/14540)) [@davidwendt](https://github.com/davidwendt) -- Refactor IndexedFrame.hash_values to use cudf::hashing functions, add xxhash64 to cudf Python. ([#14538](https://github.com/rapidsai/cudf/pull/14538)) [@bdice](https://github.com/bdice) -- Move non-templated inline function definitions from table_view.hpp to table_view.cpp ([#14535](https://github.com/rapidsai/cudf/pull/14535)) [@davidwendt](https://github.com/davidwendt) -- Add JNI for strings::code_points ([#14533](https://github.com/rapidsai/cudf/pull/14533)) [@thirtiseven](https://github.com/thirtiseven) -- Add a test for issue 12773 ([#14529](https://github.com/rapidsai/cudf/pull/14529)) [@vyasr](https://github.com/vyasr) -- Split libarrow build dependencies. ([#14506](https://github.com/rapidsai/cudf/pull/14506)) [@bdice](https://github.com/bdice) -- Implement `IndexedFrame.duplicated` with `distinct_indices` + `scatter` ([#14493](https://github.com/rapidsai/cudf/pull/14493)) [@wence-](https://github.com/wence-) -- Expunge as_frame conversions in Column algorithms ([#14491](https://github.com/rapidsai/cudf/pull/14491)) [@wence-](https://github.com/wence-) -- Remove unsanitized null from input strings column in rank_tests.cpp ([#14475](https://github.com/rapidsai/cudf/pull/14475)) [@davidwendt](https://github.com/davidwendt) -- Refactor Parquet kernel_error ([#14464](https://github.com/rapidsai/cudf/pull/14464)) [@etseidl](https://github.com/etseidl) -- Deprecate cudf::make_strings_column accepting typed offsets ([#14461](https://github.com/rapidsai/cudf/pull/14461)) [@davidwendt](https://github.com/davidwendt) -- Remove deprecated nvtext::load_merge_pairs_file ([#14460](https://github.com/rapidsai/cudf/pull/14460)) [@davidwendt](https://github.com/davidwendt) -- Introduce Comprehensive Pathological Unit Tests for Issue #14409 ([#14459](https://github.com/rapidsai/cudf/pull/14459)) [@aocsa](https://github.com/aocsa) -- Expose stream parameter in public nvtext APIs ([#14456](https://github.com/rapidsai/cudf/pull/14456)) [@davidwendt](https://github.com/davidwendt) -- Include encode type in the error message when unsupported Parquet encoding is detected ([#14453](https://github.com/rapidsai/cudf/pull/14453)) [@ZelboK](https://github.com/ZelboK) -- Remove null mask for zero nulls in json readers ([#14451](https://github.com/rapidsai/cudf/pull/14451)) [@karthikeyann](https://github.com/karthikeyann) -- Refactor cudf.Series.__init__ ([#14450](https://github.com/rapidsai/cudf/pull/14450)) [@mroeschke](https://github.com/mroeschke) -- Remove the use of `volatile` in Parquet ([#14448](https://github.com/rapidsai/cudf/pull/14448)) [@vuule](https://github.com/vuule) -- REF: Remove **kwargs from to_pandas, raise if nullable is not implemented ([#14438](https://github.com/rapidsai/cudf/pull/14438)) [@mroeschke](https://github.com/mroeschke) -- Testing stream pool implementation ([#14437](https://github.com/rapidsai/cudf/pull/14437)) [@shrshi](https://github.com/shrshi) -- Match pandas join ordering obligations in pandas-compatible mode ([#14428](https://github.com/rapidsai/cudf/pull/14428)) [@wence-](https://github.com/wence-) -- Forward-merge branch-23.12 to branch-24.02 ([#14426](https://github.com/rapidsai/cudf/pull/14426)) [@bdice](https://github.com/bdice) -- Use isinstance(..., cudf.IntervalDtype) instead of is_interval_dtype ([#14424](https://github.com/rapidsai/cudf/pull/14424)) [@mroeschke](https://github.com/mroeschke) -- Use isinstance(..., cudf.CategoricalDtype) instead of is_categorical_dtype ([#14423](https://github.com/rapidsai/cudf/pull/14423)) [@mroeschke](https://github.com/mroeschke) -- Forward-merge branch-23.12 to branch-24.02 ([#14422](https://github.com/rapidsai/cudf/pull/14422)) [@bdice](https://github.com/bdice) -- REF: Remove instances of pd.core ([#14421](https://github.com/rapidsai/cudf/pull/14421)) [@mroeschke](https://github.com/mroeschke) -- Expose streams in public filling APIs for label_bins ([#14401](https://github.com/rapidsai/cudf/pull/14401)) [@ZelboK](https://github.com/ZelboK) -- Consolidate 1D pandas object handling in as_column ([#14394](https://github.com/rapidsai/cudf/pull/14394)) [@mroeschke](https://github.com/mroeschke) -- Limit DELTA_BINARY_PACKED encoder to the same number of bits as the physical type being encoded ([#14392](https://github.com/rapidsai/cudf/pull/14392)) [@etseidl](https://github.com/etseidl) -- Add SHA-1 and SHA-2 hash functions. ([#14391](https://github.com/rapidsai/cudf/pull/14391)) [@bdice](https://github.com/bdice) -- Expose streams in Parquet reader and writer APIs ([#14359](https://github.com/rapidsai/cudf/pull/14359)) [@shrshi](https://github.com/shrshi) -- Update to fmt 10.1.1 and spdlog 1.12.0. ([#14355](https://github.com/rapidsai/cudf/pull/14355)) [@bdice](https://github.com/bdice) -- Replace default stream for scalars and column factories usages (because of defaulted arguments) ([#14354](https://github.com/rapidsai/cudf/pull/14354)) [@karthikeyann](https://github.com/karthikeyann) -- Expose streams in ORC reader and writer APIs ([#14350](https://github.com/rapidsai/cudf/pull/14350)) [@shrshi](https://github.com/shrshi) -- Convert compression and io to string axis type in IO benchmarks ([#14347](https://github.com/rapidsai/cudf/pull/14347)) [@SurajAralihalli](https://github.com/SurajAralihalli) -- Add cuDF devcontainers ([#14015](https://github.com/rapidsai/cudf/pull/14015)) [@trxcllnt](https://github.com/trxcllnt) -- Refactoring of Buffers (last step towards unifying COW and Spilling) ([#13801](https://github.com/rapidsai/cudf/pull/13801)) [@madsbk](https://github.com/madsbk) -- Switch to scikit-build-core ([#13531](https://github.com/rapidsai/cudf/pull/13531)) [@vyasr](https://github.com/vyasr) -- Simplify null count checking in column equality comparator ([#13312](https://github.com/rapidsai/cudf/pull/13312)) [@vyasr](https://github.com/vyasr) - # cuDF 23.12.00 (6 Dec 2023) ## 🚨 Breaking Changes