diff --git a/NEWS.md b/NEWS.md index 8f9e56a81a..9d2a74aa6d 100644 --- a/NEWS.md +++ b/NEWS.md @@ -2,6 +2,10 @@ # data.table [v1.15.4](https://github.com/Rdatatable/data.table/milestone/33) (in progress) +## BUG FIXES + +1. Optimized `shift` per group produced wrong results when simultaneously subsetting, for example, `DT[i==1L, shift(x), by=group]`, [#5962](https://github.com/Rdatatable/data.table/issues/5962). Thanks to @renkun-ken for the report and Benjamin Schwendinger for the fix. + # data.table [v1.15.2](https://github.com/Rdatatable/data.table/milestone/32) (27 Feb 2024) ## BUG FIXES @@ -10,7 +14,19 @@ 2. `shift()` of a vector in grouped queries (under GForce) returns a vector, consistent with `shift()` in other contexts, [#5939](https://github.com/Rdatatable/data.table/issues/5939). Thanks @shrektan for the report and @MichaelChirico for the fix. -# data.table [v1.15.0](https://github.com/Rdatatable/data.table/milestone/29) (14 Jan 2024) +## NOTES + +1. `transform` method for data.table sped up substantially when creating new columns on large tables. Thanks to @OfekShilon for the report and PR. The implemented solution was proposed by @ColeMiller1. + +2. The documentation for the `fill` argument in `rbind()` and `rbindlist()` now notes the expected behaviour for missing `list` columns when `fill=TRUE`, namely to use `NULL` (not `NA`), [#4198](https://github.com/Rdatatable/data.table/pull/4198). Thanks @sritchie73 for the proposal and fix. + +3. data.table now depends on R 3.2.0 (2015) instead of 3.1.0 (2014). 1.17.0 will likely move to R 3.3.0 (2016). Recent versions of R have good features that we would gradually like to incorporate, and we see next to no usage of these very old versions of R. + +4. Erroneous assignment calls in `[` with a trailing comma (e.g. ``DT[, `:=`(a = 1, b = 2,)]``) get a friendlier error since this situation is common during refactoring and easy to miss visually. Thanks @MichaelChirico for the fix. + +5. Input files are now kept open during `mmap()` when running under Emscripten, [emscripten-core/emscripten#20459](https://github.com/emscripten-core/emscripten/issues/20459). This avoids an error in `fread()` when running in WebAssembly, [#5969](https://github.com/Rdatatable/data.table/issues/5969). Thanks to @maek-ies for the report and @georgestagg for the PR. + +6. `dcast()` message about `fun.aggregate` defaulting to `length()` when aggregation is necessary, which could be confusing if duplicates were unexpected, does better explaining the behavior and suggesting alternatives, [#5217](https://github.com/Rdatatable/data.table/issues/5217). Thanks @MichaelChirico for the suggestion and @Nj221102 for the fix. ## BREAKING CHANGE diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw index 6c66d4481a..b62f7703e9 100644 --- a/inst/tests/tests.Rraw +++ b/inst/tests/tests.Rraw @@ -17835,6 +17835,9 @@ test(2224.3, DT[, shift(x), g], error="Type 'list' is not supported by GForce gs # single vector is un-list()'d, consistently with plain shift(), #5939 DT = data.table(x=1, g=1) test(2224.4, DT[, .(shift(x)), g], DT[, shift(x), g]) +# gshift with subset #5962 +DT = data.table(x = 1:3, g = rep(1:2, each=3L)) +test(2224.5, DT[g==1L, shift(x), by=g, verbose=TRUE], data.table(g=1L, V1=c(NA,1L,2L)), output="Making each group and running j (GForce TRUE)") # groupingsets by named by argument test(2225.1, groupingsets(data.table(iris), j=sum(Sepal.Length), by=c('Sp'='Species'), sets=list('Species')), diff --git a/src/gsumm.c b/src/gsumm.c index efa13883e0..48b6a01c09 100644 --- a/src/gsumm.c +++ b/src/gsumm.c @@ -1203,7 +1203,7 @@ SEXP gshift(SEXP x, SEXP nArg, SEXP fillArg, SEXP typeArg) { bool lag; const bool cycle = stype == CYCLIC; - R_xlen_t nx = xlength(x), nk = length(nArg); + R_xlen_t nk = length(nArg); if (!isInteger(nArg)) error(_("Internal error: n must be integer")); // # nocov const int *kd = INTEGER(nArg); for (int i=0; i