Skip to content

Commit

Permalink
gshift works in subset queries (#5963)
Browse files Browse the repository at this point in the history
* add fix for subset

* add NEWS

---------

Co-authored-by: Benjamin Schwendinger <[email protected]>
  • Loading branch information
MichaelChirico and ben-schwen committed Mar 8, 2024
1 parent fb1933d commit bc14b7a
Show file tree
Hide file tree
Showing 3 changed files with 22 additions and 3 deletions.
18 changes: 17 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@

# data.table [v1.15.4](https://github.com/Rdatatable/data.table/milestone/33) (in progress)

## BUG FIXES

1. Optimized `shift` per group produced wrong results when simultaneously subsetting, for example, `DT[i==1L, shift(x), by=group]`, [#5962](https://github.com/Rdatatable/data.table/issues/5962). Thanks to @renkun-ken for the report and Benjamin Schwendinger for the fix.

# data.table [v1.15.2](https://github.com/Rdatatable/data.table/milestone/32) (27 Feb 2024)

## BUG FIXES
Expand All @@ -10,7 +14,19 @@

2. `shift()` of a vector in grouped queries (under GForce) returns a vector, consistent with `shift()` in other contexts, [#5939](https://github.com/Rdatatable/data.table/issues/5939). Thanks @shrektan for the report and @MichaelChirico for the fix.

# data.table [v1.15.0](https://github.com/Rdatatable/data.table/milestone/29) (14 Jan 2024)
## NOTES

1. `transform` method for data.table sped up substantially when creating new columns on large tables. Thanks to @OfekShilon for the report and PR. The implemented solution was proposed by @ColeMiller1.

2. The documentation for the `fill` argument in `rbind()` and `rbindlist()` now notes the expected behaviour for missing `list` columns when `fill=TRUE`, namely to use `NULL` (not `NA`), [#4198](https://github.com/Rdatatable/data.table/pull/4198). Thanks @sritchie73 for the proposal and fix.

3. data.table now depends on R 3.2.0 (2015) instead of 3.1.0 (2014). 1.17.0 will likely move to R 3.3.0 (2016). Recent versions of R have good features that we would gradually like to incorporate, and we see next to no usage of these very old versions of R.

4. Erroneous assignment calls in `[` with a trailing comma (e.g. ``DT[, `:=`(a = 1, b = 2,)]``) get a friendlier error since this situation is common during refactoring and easy to miss visually. Thanks @MichaelChirico for the fix.

5. Input files are now kept open during `mmap()` when running under Emscripten, [emscripten-core/emscripten#20459](https://github.com/emscripten-core/emscripten/issues/20459). This avoids an error in `fread()` when running in WebAssembly, [#5969](https://github.com/Rdatatable/data.table/issues/5969). Thanks to @maek-ies for the report and @georgestagg for the PR.

6. `dcast()` message about `fun.aggregate` defaulting to `length()` when aggregation is necessary, which could be confusing if duplicates were unexpected, does better explaining the behavior and suggesting alternatives, [#5217](https://github.com/Rdatatable/data.table/issues/5217). Thanks @MichaelChirico for the suggestion and @Nj221102 for the fix.

## BREAKING CHANGE

Expand Down
3 changes: 3 additions & 0 deletions inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -17835,6 +17835,9 @@ test(2224.3, DT[, shift(x), g], error="Type 'list' is not supported by GForce gs
# single vector is un-list()'d, consistently with plain shift(), #5939
DT = data.table(x=1, g=1)
test(2224.4, DT[, .(shift(x)), g], DT[, shift(x), g])
# gshift with subset #5962
DT = data.table(x = 1:3, g = rep(1:2, each=3L))
test(2224.5, DT[g==1L, shift(x), by=g, verbose=TRUE], data.table(g=1L, V1=c(NA,1L,2L)), output="Making each group and running j (GForce TRUE)")

# groupingsets by named by argument
test(2225.1, groupingsets(data.table(iris), j=sum(Sepal.Length), by=c('Sp'='Species'), sets=list('Species')),
Expand Down
4 changes: 2 additions & 2 deletions src/gsumm.c
Original file line number Diff line number Diff line change
Expand Up @@ -1203,7 +1203,7 @@ SEXP gshift(SEXP x, SEXP nArg, SEXP fillArg, SEXP typeArg) {
bool lag;
const bool cycle = stype == CYCLIC;

R_xlen_t nx = xlength(x), nk = length(nArg);
R_xlen_t nk = length(nArg);
if (!isInteger(nArg)) error(_("Internal error: n must be integer")); // # nocov
const int *kd = INTEGER(nArg);
for (int i=0; i<nk; i++) if (kd[i]==NA_INTEGER) error(_("Item %d of n is NA"), i+1);
Expand All @@ -1220,7 +1220,7 @@ SEXP gshift(SEXP x, SEXP nArg, SEXP fillArg, SEXP typeArg) {
}
R_xlen_t ansi = 0;
SEXP tmp;
SET_VECTOR_ELT(ans, g, tmp=allocVector(TYPEOF(x), nx));
SET_VECTOR_ELT(ans, g, tmp=allocVector(TYPEOF(x), n));
#define SHIFT(CTYPE, RTYPE, ASSIGN) { \
const CTYPE *xd = (const CTYPE *)RTYPE(x); \
const CTYPE fill = RTYPE(thisfill)[0]; \
Expand Down

0 comments on commit bc14b7a

Please sign in to comment.