
vctrs 0.4.0

@lionel- released this 30 Mar 11:44
  • New experimental vec_locate_sorted_groups() for returning the locations of
    groups in sorted order. This is equivalent to, but faster than, calling
    vec_group_loc() and then sorting by the key column of the result.
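
    A minimal sketch of the equivalence described above, assuming vctrs 0.4.0 or
    later is installed. The input vector is made up for illustration.

    ```r
    library(vctrs)

    x <- c(3L, 1L, 3L, 2L, 1L)

    # One step: groups come back ordered by key.
    sorted <- vec_locate_sorted_groups(x)

    # Two steps: groups come back in order of first appearance,
    # then the rows are reordered by the key column.
    groups <- vec_group_loc(x)
    manual <- vec_slice(groups, vec_order(groups$key))

    sorted$key
    manual$key
    ```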

  • New experimental vec_locate_matches() for locating where each observation
    in one vector matches one or more observations in another vector. It is
    similar to vec_match(), but returns all matches by default (rather than just
    the first), and can match on binary conditions other than equality. The
    algorithm is inspired by data.table's very fast binary merge procedure.
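
    To illustrate the difference from vec_match(), here is a small sketch with
    made-up inputs. It relies only on the documented needles/haystack output
    columns and the condition argument.

    ```r
    library(vctrs)

    needles <- c(1, 2, 3)
    haystack <- c(2, 1, 2)

    # vec_match() reports only the first match for each needle.
    first_only <- vec_match(needles, haystack)

    # vec_locate_matches() reports every match as a data frame with
    # `needles` and `haystack` location columns; unmatched needles get NA.
    all_matches <- vec_locate_matches(needles, haystack)

    # A non-equality condition: haystack locations where needle >= haystack.
    at_least <- vec_locate_matches(needles, haystack, condition = ">=")

    first_only
    all_matches
    ```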

  • The vec_proxy_equal(), vec_proxy_compare(), and vec_proxy_order()
    methods for vctrs_rcrd are now applied recursively over the fields (#1503).

  • Lossy cast errors now inherit from incompatible type errors.

  • vec_is_list() now returns TRUE for AsIs lists (#1463).

  • vec_assert(), vec_ptype2(), vec_cast(), and vec_as_location()
    now use caller_arg() to infer a default arg value from the
    caller.

    This may result in unhelpful arguments being mentioned in error
    messages. In general, you should consider snapshotting vctrs error
    messages thrown in your package and supplying arg and call
    arguments if the error context is not adequately reported to your
    users.
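
    One way to supply the error context explicitly is sketched below. Both
    check_count() and repeat_label() are hypothetical functions invented for
    this example; only vec_assert(), caller_arg(), and caller_env() come from
    vctrs and rlang.

    ```r
    library(vctrs)
    library(rlang)

    # Hypothetical internal checker: forwards the caller's argument name and
    # calling environment so the error points at the user-facing function.
    check_count <- function(x, arg = caller_arg(x), call = caller_env()) {
      vec_assert(x, ptype = integer(), size = 1L, arg = arg, call = call)
      invisible(x)
    }

    # Hypothetical user-facing function.
    repeat_label <- function(label, times) {
      check_count(times)
      strrep(label, times)
    }

    # Capture the error instead of printing it, then inspect the message.
    err <- tryCatch(repeat_label("ab", "2"), error = function(cnd) cnd)
    conditionMessage(err)
    ```

    With this pattern the message should name `times` rather than the internal
    `x`, which is the kind of detail a snapshot test would pin down.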

  • vec_ptype_common(), vec_cast_common(), vec_size_common(), and
    vec_recycle_common() gain call and arg arguments for
    specifying an error context.

  • vec_compare() can now compare zero column data frames (#1500).

  • new_data_frame() now errors on negative and missing n values (#1477).

  • vec_order() now correctly orders zero column data frames (#1499).

  • vctrs now depends on cli to help with error message generation.

  • New vec_check_list() and list_check_all_vectors() input
    checkers, and an accompanying list_all_vectors() predicate.
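
    A brief sketch of how the predicate and the two checkers might be used,
    assuming the 0.4.0 names listed above. The inputs are made up.

    ```r
    library(vctrs)

    # Predicate: are all elements of the list vectors?
    ok <- list_all_vectors(list(1:3, "a", data.frame(x = 1)))
    not_ok <- list_all_vectors(list(1:3, mean))  # a function is not a vector

    # Checkers signal an error instead of returning FALSE.
    # The errors are captured here so the script keeps running.
    not_a_list <- tryCatch(vec_check_list(1:3), error = function(cnd) cnd)
    bad_element <- tryCatch(
      list_check_all_vectors(list(1:3, mean)),
      error = function(cnd) cnd
    )
    ```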

  • New vec_interleave() for combining multiple vectors together, interleaving
    their elements in the process (#1396).
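
    For illustration, a minimal example with made-up inputs:

    ```r
    library(vctrs)

    # Elements are taken in turn from each input.
    out <- vec_interleave(1:3, 4:6)
    out
    # 1 4 2 5 3 6
    ```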

  • vec_equal_na(NULL) now returns logical(0) rather than erroring (#1494).

  • vec_as_location(missing = "error") now fails with NA and NA_character_
    in addition to NA_integer_ (#1420, @krlmlr).

  • Starting with rlang 1.0.0, errors are displayed with the contextual
    function call. Several vctrs operations gain a call argument that
    makes it possible to report the correct context in error messages.
    This concerns:

    • vec_cast() and vec_ptype2()
    • vec_default_cast() and vec_default_ptype2()
    • vec_assert()
    • vec_as_names()
    • stop_ constructors like stop_incompatible_type()

    Note that default vec_cast() and vec_ptype2() methods
    automatically support this if they pass ... to the corresponding
    vec_default_ functions. If you throw a non-internal error from a
    non-default method, add a call = caller_env() argument in the
    method and pass it to rlang::abort().
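
    The last point can be sketched as follows. The my_percent class, its
    new_percent() constructor, and the negative-value rule are all hypothetical
    and exist only for this example; the method is defined at top level so that
    vctrs can find it.

    ```r
    library(vctrs)
    library(rlang)

    # Hypothetical class used only for this sketch.
    new_percent <- function(x = double()) {
      new_vctr(x, class = "my_percent")
    }

    # Non-default cast method (double -> my_percent). It accepts `call` and
    # forwards it to abort() so the error reports the user's context.
    vec_cast.my_percent.double <- function(x, to, ..., call = caller_env()) {
      if (any(x < 0, na.rm = TRUE)) {
        abort("Can't convert negative values to <my_percent>.", call = call)
      }
      new_percent(x)
    }

    good <- vec_cast(0.25, new_percent())
    bad <- tryCatch(vec_cast(-0.5, new_percent()), error = function(cnd) cnd)
    conditionMessage(bad)
    ```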

  • If NA_character_ is specified as a name for vctrs_vctr objects, it is
    now automatically repaired to "" (#780).

  • "" is now an allowed name for vctrs_vctr objects and all their
    subclasses (vctrs_list_of in particular) (#780).

  • list_of() is now much faster when many values are provided.

  • vec_as_location() evaluates arg only in case of error, for performance
    (#1150, @krlmlr).

  • levels.vctrs_vctr() now returns NULL instead of failing (#1186, @krlmlr).

  • vec_assert() produces a more informative error when size is invalid
    (#1470).

  • vec_duplicate_detect() is a bit faster when there are many unique values.

  • vec_proxy_order() is described in vignette("s3-vectors") (#1373, @krlmlr).

  • vec_chop() now materializes ALTREP vectors before chopping, which is more
    efficient than creating many small ALTREP pieces (#1450).

  • New list_drop_empty() for removing empty elements from a list (#1395).
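
    A short example with a made-up list, where NULL and zero-length vectors
    count as empty:

    ```r
    library(vctrs)

    x <- list(1, NULL, integer(), 2)
    kept <- list_drop_empty(x)
    length(kept)
    # 2
    ```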

  • list_sizes() now propagates the names of the list onto the result.
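
    For example, with a made-up named list:

    ```r
    library(vctrs)

    sizes <- list_sizes(list(a = 1:3, b = c("x", "y")))
    sizes
    # a b
    # 3 2
    ```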

  • Name repair messages are now signaled by rlang::names_inform_repair(). This
    means that the messages are now sent to stdout by default rather than to
    stderr, resulting in prettier messages. Additionally, name repair messages can
    now be silenced through the global option rlib_name_repair_verbosity, which
    is useful for testing purposes. See ?names_inform_repair for more
    information (#1429).
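
    A minimal sketch of silencing the messages in a test or script, assuming the
    "quiet" value described in ?names_inform_repair. The option is restored
    afterwards so the setting does not leak.

    ```r
    library(vctrs)

    # Set the option and remember the previous value.
    old <- options(rlib_name_repair_verbosity = "quiet")

    # Repair still happens; only the message is suppressed.
    fixed <- vec_as_names(c("a", "a", "b"), repair = "unique")

    # Restore the previous setting.
    options(old)

    fixed
    ```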

  • vctrs_vctr methods for na.omit(), na.exclude(), and na.fail() have
    been added (#1413).

  • vec_init() is now slightly faster (#1423).

  • vec_set_names() no longer corrupts vctrs_rcrd types (#1419).

  • vec_detect_complete() now computes completeness for vctrs_rcrd types in
    the same way as data frames, which means that if any field is missing, the
    entire record is considered incomplete (#1386).
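
    The rule can be checked with a small record built from made-up fields and
    compared against the equivalent data frame:

    ```r
    library(vctrs)

    fields <- list(a = c(1, NA, 3), b = c("x", "y", NA))

    rec <- new_rcrd(fields)
    df <- new_data_frame(fields)

    # A record is complete only if every field is non-missing.
    rec_complete <- vec_detect_complete(rec)
    df_complete <- vec_detect_complete(df)

    rec_complete
    # TRUE FALSE FALSE
    ```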

  • The na_value argument of vec_order() and vec_sort() now correctly
    respects missing values in lists (#1401).

  • vec_rep() and vec_rep_each() are much faster for times = 0 and
    times = 1 (@mgirlich, #1392).

  • vec_equal_na() and vec_fill_missing() now work with integer64 vectors
    (#1304).

  • The xtfrm() method for vctrs_vctr objects no longer accidentally breaks
    ties (#1354).

  • min(), max(), and range() no longer throw an error if na.rm = TRUE is
    set and all values are NA (@gorcha, #1357). In this case, and when given an
    empty input, they return Inf/-Inf, or NA if Inf can't be cast to the input
    type.

  • vec_group_loc(), used for grouping in dplyr, now correctly handles
    vectors with billions of elements (up to .Machine$integer.max) (#1133).