Skip to content

Commit

Permalink
Modernize ?setkey; and tackle long-deprecated functions (#3399)
Browse files Browse the repository at this point in the history
  • Loading branch information
MichaelChirico authored and mattdowle committed Mar 2, 2019
1 parent 8105bae commit e192793
Show file tree
Hide file tree
Showing 10 changed files with 100 additions and 114 deletions.
3 changes: 2 additions & 1 deletion NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,8 @@ exportClasses(data.table, IDate, ITime)
##

export(data.table, tables, setkey, setkeyv, key, "key<-", haskey, CJ, SJ, copy)
export(set2key, set2keyv, key2, setindex, setindexv, indices)
export(setindex, setindexv, indices)
export(set2key, set2keyv, key2) # deprecated with helpful error; remove after May 2019 (see #3399)
export(as.data.table,is.data.table,test.data.table,last,first,like,"%like%",between,"%between%",inrange,"%inrange%")
export(timetaken)
export(truelength, alloc.col, ":=")
Expand Down
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,10 @@
Column 2 of by= (2) is type 'list', not yet supported. Please use the by= argument to specify columns with types that are supported.
```
9. Reminder that note 11 in v1.11.0 (May 2018) warned that `set2key()` and `key2()` will be removed in May 2019. They have been warning since v1.9.8 (Nov 2016) and their warnings were upgraded to errors in v1.11.0 (May 2018). When they were introduced in version 1.9.4 (Oct 2014) they were marked as 'experimental'.
10. The `key(DT)<-` form of `setkey()` has been warning since at least 2012 to use `setkey()`. The warning is now stronger: `key(x)<-value is deprecated and not supported. Please change to use setkey().`. This warning will be upgraded to error in one year.
### Changes in v1.12.0 (13 Jan 2019)
Expand Down
2 changes: 1 addition & 1 deletion R/data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -1173,7 +1173,7 @@ chmatch2 <- function(x, table, nomatch=NA_integer_) {
m[is.na(m)] = ncol(x)+seq_len(length(newnames))
cols = as.integer(m)
if ((ok<-selfrefok(x,verbose))==0L) # ok==0 so no warning when loaded from disk (-1) [-1 considered TRUE by R]
warning("Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.")
warning("Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.")
if ((ok<1L) || (truelength(x) < ncol(x)+length(newnames))) {
DT = x # in case getOption contains "ncol(DT)" as it used to. TODO: warn and then remove
n = length(newnames) + eval(getOption("datatable.alloccol")) # TODO: warn about expressions and then drop the eval()
Expand Down
1 change: 1 addition & 0 deletions R/duplicated.R
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@

warning_oldUniqueByKey = "The deprecated option 'datatable.old.unique.by.key' is being used. Please stop using it and pass 'by=key(DT)' instead for clarity. For more information please search the NEWS file for this option."
# upgrade the 4 calls below to error after May 2019 ( see note 10 from 1.11.0 May 2018 which said one year from then )

duplicated.data.table <- function(x, incomparables=FALSE, fromLast=FALSE, by=seq_along(x), ...) {
if (!cedta()) return(NextMethod("duplicated")) #nocov
Expand Down
2 changes: 1 addition & 1 deletion R/onLoad.R
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@
"datatable.auto.index"="TRUE", # DT[col=="val"] to auto add index so 2nd time faster
"datatable.use.index"="TRUE", # global switch to address #1422
"datatable.prettyprint.char" = NULL, # FR #1091
"datatable.old.unique.by.key" = "FALSE" # TODO: change warnings in duplicated.R to error on or after Jan 2019 then remove in Jan 2020.
"datatable.old.unique.by.key" = "FALSE" # TODO: change warnings in duplicated.R to error on or after May 2019 then remove a year after that.
)
for (i in setdiff(names(opts),names(options()))) {
eval(parse(text=paste0("options(",i,"=",opts[i],")")))
Expand Down
32 changes: 14 additions & 18 deletions R/setkey.R
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,20 @@ setindexv <- function(x, cols, verbose=getOption("datatable.verbose")) {
}
}

set2key <- function(...) {
stop("set2key() is now deprecated. Please use setindex() instead.")
}
set2keyv <- function(...) {
stop("set2keyv() is now deprecated. Please use setindexv() instead.")
}
key2 <- function(x) {
stop("key2() is now deprecated. Please use indices() instead.")
# remove these 3 after May 2019; see discussion in #3399 and notes in v1.12.2. They were marked experimental after all.
set2key <- function(...) stop("set2key() is now deprecated. Please use setindex() instead.")
set2keyv <- function(...) stop("set2keyv() is now deprecated. Please use setindexv() instead.")
key2 <- function(...) stop("key2() is now deprecated. Please use indices() instead.")

# upgrade to error after Mar 2020. Has already been warning since 2012, and stronger warning in Mar 2019 (note in news for 1.12.2); #3399
"key<-" <- function(x,value) {
warning("key(x)<-value is deprecated and not supported. Please change to use setkey() with perhaps copy(). Has been warning since 2012 and will be an error in future.")
setkeyv(x,value)
# The returned value here from key<- is then copied by R before assigning to x, it seems. That's
# why we can't do anything about it without a change in R itself. If we return NULL (or invisible()) from this key<-
# method, the table gets set to NULL. So, although we call setkeyv(x,cols) here, and that doesn't copy, the
# returned value (x) then gets copied by R.
# So, solution is that caller has to call setkey or setkeyv directly themselves, to avoid <- dispatch and its copy.
}

setkeyv <- function(x, cols, verbose=getOption("datatable.verbose"), physical=TRUE)
Expand Down Expand Up @@ -127,16 +133,6 @@ getindex <- function(x, name) {
ans
}

"key<-" <- function(x,value) {
warning("The key(x)<-value form of setkey can copy the whole table. This is due to <- in R itself. Please change to setkeyv(x,value) or setkey(x,...) which do not copy and are faster. See help('setkey'). You can safely ignore this warning if it is inconvenient to change right now. Setting options(warn=2) turns this warning into an error, so you can then use traceback() to find and change your key<- calls.")
setkeyv(x,value)
# The returned value here from key<- is then copied by R before assigning to x, it seems. That's
# why we can't do anything about it without a change in R itself. If we return NULL (or invisible()) from this key<-
# method, the table gets set to NULL. So, although we call setkeyv(x,cols) here, and that doesn't copy, the
# returned value (x) then gets copied by R.
# So, solution is that caller has to call setkey or setkeyv directly themselves, to avoid <- dispatch and its copy.
}

haskey <- function(x) !is.null(key(x))

# reverse a vector by reference (no copy)
Expand Down
2 changes: 1 addition & 1 deletion inst/tests/tests.Rraw
Original file line number Diff line number Diff line change
Expand Up @@ -1501,7 +1501,7 @@ test(505, DT[J(a=1,b=6),sum(i.b*b),by=.EACHI]$V1, 24) # 24 now 'double' because

# Test := after a key<-
DT = data.table(a=3:1,b=4:6)
test(506, key(DT)<-"a", "a", warning="can copy the whole table")
test(506, key(DT)<-"a", "a", warning="deprecated")
test(508, DT, data.table(a=1:3,b=6:4,key="a"))
test(509, DT[,b:=10L], data.table(a=1:3,b=10L,key="a"))
test(510, DT[,c:=11L], data.table(a=1:3,b=10L,c=11L,key="a")) # Used to be warning about invalid .internal.selfref detected and fixed. As from v1.8.3 data.table() returns a NAMED==0 object, and key<- appears not to copy that. But within functions, key<- would still copy. TO DO: add tests....
Expand Down
21 changes: 21 additions & 0 deletions man/set2key.Rd
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
\name{set2key}
\alias{set2key}
\alias{set2keyv}
\alias{key2}
\alias{key<-}
\title{ Deprecated. }
\description{
These functions are deprecated. They will be removed in future. Please use the functions in \code{\link{setkey}}.
}
\usage{
set2key(...) # DEPRECATED; helpful error since May 2018 and warning since Nov 2016
set2keyv(...) # DEPRECATED; helpful error since May 2018 and warning since Nov 2016
key2(...) # DEPRECATED; helpful error since May 2018 and warning since Nov 2016
key(x) <- value # DEPRECATED; strong warning since Mar 2019 and softer warning since 2012
}
\arguments{
\item{\dots}{ Deprecated. }
\item{x}{ Deprecated. }
\item{value}{ Deprecated. }
}

Loading

0 comments on commit e192793

Please sign in to comment.