Releases: apache/datasketches-cpp

datasketches-cpp-5.2.0

15 Jan 21:57
  • Implemented Bloom filter
  • Added get_PMF() and get_CDF() methods to t-digest
  • Fixed a bug in Theta sketch compression

datasketches-cpp-5.1.0

02 Aug 23:09
  • implemented t-digest
  • added get_serialized_size_bytes() and get_max_serialized_size_bytes() to compact Theta sketch
  • fixed compressed Theta sketch stream serialization
  • added Tuple sketch filter() method

5.0.2

13 Jan 07:20

This is a patch update. The original 5.0.0 release notes are presented next, with a cumulative set of patch update changes at the end.

This is a major release because the Python part of the library was separated into its own repository, datasketches-python, which may be API-breaking for some users. We also took this opportunity to do some other potentially API-breaking cleanup.

  • moved all Python-related code to new datasketches-python repository
  • finished moving public constants to separate namespaces
  • removed deprecated methods (such as get_quantiles())
  • generalized array_of_doubles sketch as array_tuple_sketch
  • implemented new EB-PPS sketch (exact PPS sampling with bounded sample size)
  • fixed slowness in Theta intersection
  • fixed incompatibility of serialized empty frequent items sketches with Java

The patch release fixes:

  • a bug in KLL that could cause a self-move (undefined behavior) (5.0.1)
  • a bug in EBPPS Sampling's to_string() method that could cause compilation failure for non-string types (5.0.1)
  • use of a method in density sketch that was removed in C++17, breaking forward compatibility (5.0.2)

datasketches-cpp-5.0.1

22 Dec 23:46
7934d4c

This is a major release because the Python part of the library was separated into its own repository, datasketches-python, which may be API-breaking for some users. We also took this opportunity to do some other potentially API-breaking cleanup.

  • moved all Python-related code to new datasketches-python repository
  • finished moving public constants to separate namespaces
  • removed deprecated methods (such as get_quantiles())
  • generalized array_of_doubles sketch as array_tuple_sketch
  • implemented new EB-PPS sketch (exact PPS sampling with bounded sample size)
  • fixed slowness in Theta intersection
  • fixed incompatibility of serialized empty frequent items sketches with Java

The patch release fixes:

  • a bug in KLL that could cause a self-move (undefined behavior)
  • a bug in EBPPS Sampling's to_string() method that could cause compilation failure for non-string types

datasketches-cpp-5.0.0

13 Nov 23:28

This is a major release because the Python part of the library was separated into its own repository, datasketches-python, which may be API-breaking for some users. We also took this opportunity to do some other potentially API-breaking cleanup.

  • moved all Python-related code to new datasketches-python repository
  • finished moving public constants to separate namespaces
  • removed deprecated methods (such as get_quantiles())
  • generalized array_of_doubles sketch as array_tuple_sketch
  • implemented new EB-PPS sketch (exact PPS sampling with bounded sample size)
  • fixed slowness in Theta intersection
  • fixed incompatibility of serialized empty frequent items sketches with Java

datasketches-cpp-4.1.0

03 May 19:20
  • HLL union speed improvement
  • Fixed a bug in theta and tuple union base
  • new density sketch
  • new count min sketch
  • thread local random generator
  • generic quantile sketches in Python (KLL, REQ, classic quantiles)
  • generic frequent items sketch in Python
  • generic tuple sketch in Python
  • added optional compression of serialized theta sketch
  • iterators use new style (no inheritance from std::iterator)
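
As background for the last item: `std::iterator` was deprecated in C++17, and the usual replacement is to declare the five iterator member types directly. The sketch below shows that pattern on a hypothetical `int_const_iterator`; it is not taken from the library's sources.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical iterator over contiguous ints, for illustration only.
// Instead of inheriting from std::iterator (deprecated in C++17), it
// declares the five member types that std::iterator_traits looks for.
class int_const_iterator {
public:
  using iterator_category = std::input_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int*;
  using reference = const int&;

  explicit int_const_iterator(const int* ptr) : ptr_(ptr) {}

  reference operator*() const { return *ptr_; }
  pointer operator->() const { return ptr_; }
  int_const_iterator& operator++() { ++ptr_; return *this; }
  int_const_iterator operator++(int) { auto tmp = *this; ++ptr_; return tmp; }
  bool operator==(const int_const_iterator& other) const { return ptr_ == other.ptr_; }
  bool operator!=(const int_const_iterator& other) const { return ptr_ != other.ptr_; }

private:
  const int* ptr_;
};

// Standard algorithms discover the member types through iterator_traits.
static_assert(
    std::is_same<std::iterator_traits<int_const_iterator>::value_type, int>::value,
    "iterator_traits picks up the member types");
```

With the member types in place, generic code such as `std::distance` works on the iterator without any base class.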

datasketches-cpp-4.0.1

31 Jan 19:47
b91740d

This is a patch release with only very minor code changes to address several small compiler warnings.

The main difference is that the associated Python wheels distributed as convenience binaries (and not included in git) are now produced for ARM64 architectures, which should provide increased compatibility with several major cloud computing providers.

datasketches-cpp-4.0.0

06 Dec 00:52

This is a major release with some API-breaking changes:

  • Common sorted view used by all quantiles sketches with simultaneous support for both inclusive and exclusive modes
  • The default mode for all methods for querying quantiles sketches was changed from exclusive to inclusive
  • The mode is now a method parameter, not a template parameter
  • Queries on empty quantiles sketches, such as get_rank() and get_quantile(), now throw an exception (previously they returned NaN for floating-point types)
  • SerDe was removed from class templates and added to the relevant method templates (such as serialize and deserialize)
  • Support for comparator instances in quantiles sketches
  • Support for equality operator instance in frequent items sketch
  • Added operator-> to iterators over quantiles sketches
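
To make the inclusive and exclusive modes concrete, here is a small self-contained sketch that computes exact ranks over a sorted vector. It does not use the library: `rank_mode`, `exact_rank`, and the choice of `std::runtime_error` for the empty case are assumptions made for illustration. It follows the conventions listed above: the mode is a function parameter, the default is inclusive, and an empty input throws.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical names for illustration only; not the library's API.
enum class rank_mode { inclusive, exclusive };

// Exact normalized rank of `value` in an already sorted vector.
// inclusive: fraction of items <= value; exclusive: fraction of items < value.
double exact_rank(const std::vector<double>& sorted, double value,
                  rank_mode mode = rank_mode::inclusive) {
  // Assumed exception type; the point is that empty input is an error, not NaN.
  if (sorted.empty()) throw std::runtime_error("rank query on empty input");
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  const auto pos = (mode == rank_mode::inclusive)
      ? std::upper_bound(sorted.begin(), sorted.end(), value)
      : std::lower_bound(sorted.begin(), sorted.end(), value);
  return static_cast<double>(pos - sorted.begin()) /
         static_cast<double>(sorted.size());
}
```

For the values {1, 2, 2, 3}, the inclusive rank of 2 is 0.75 (three items are less than or equal to 2), while the exclusive rank is 0.25 (one item is strictly less than 2).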

v3.5.1

05 Nov 17:13
a44cb33

Patch release, no new features:

  • Fix Python wheel build script to produce valid wheels for Apple Silicon Macs
  • Fix a serialization bug for theta and tuple sketches when the sketch had no entries but was not empty (e.g. the result of an intersection between disjoint sets)

datasketches-cpp-3.5.0

13 Jul 20:35
  • Type converting constructors for KLL and REQ sketches
  • Fixed KLL copy constructor (affects non-arithmetic types)
  • Added internal check in CPC sketch compression to avoid problems with static analysis