-
Notifications
You must be signed in to change notification settings - Fork 988
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
setDT should check if any column is POSIXlt #4800
Comments
@iago-pssjd Thank you for your report. What would be even more better is to include minimal example (so we can just copy copy to observe issue you are describing) and your session info (to check if you might be using older version of DT and issue might have been fixed already). |
Thank you @jangorecki . I updated the issue and removed the reference to |
Thank you. |
It looks like a problem of In my opinion, generally, we should not change the vector value of the object by |
@shrektan is not only a problem of |
This comment has been minimized.
This comment has been minimized.
If we change POSIXlt to POSIXct it won't be by reference anymore so I think the best is to raise error. Other operations will also warn on POSIXlt input. library(data.table)
x <- data.table(g=Sys.time(), a=1)
x[["g"]] <- as.POSIXlt(Sys.time())
str(x)
#> Classes 'data.table' and 'data.frame': 11 obs. of 2 variables:
#> $ g: POSIXlt, format: "2020-11-12 02:01:56"
#> $ a: num 1
#> - attr(*, ".internal.selfref")=<externalptr>
x[, sum(a), g]
#> Error in `[.data.table`(x, , sum(a), g): column or expression 1 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))] |
@jangorecki The example is not suitable as the warning is thrown from @iago-pssjd you're right, |
Just an addition on my initial message and example that I found now. In there, I did > x <- as.data.frame(tibble(now))
> mdt <- data.table(id=1:3, d=strptime(c("06:02:36", "06:02:48", "07:03:12"), "%H:%M:%S"))
Warning message:
In as.data.table.list(x, keep.rownames = keep.rownames, check.names = check.names, :
POSIXlt column type detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date. But if I do then: str(as.data.table(x))
Classes ‘data.table’ and 'data.frame': 11 obs. of 1 variable:
$ now: POSIXlt, format: "2021-02-26 16:07:19"
- attr(*, ".internal.selfref")=<externalptr> So |
I guess we could just reuse |
I think since #6299, we start to get nicer behavior of POSIXlt columns (?). Anyway, throwing a |
Just tested example in initial post, after > library(tibble)
> library(data.table)
> now <- as.POSIXlt(Sys.time())
> x <- as.data.frame(tibble(now))
> mdt <- data.table(id=1:3, d=strptime(c("06:02:36", "06:02:48", "07:03:12"), "%H:%M:%S"))
Warning message:
In as.data.table.list(x, keep.rownames = keep.rownames, check.names = check.names, :
POSIXlt column type detected and converted to POSIXct. We do not recommend use of POSIXlt at all because it uses 40 bytes to store one date.
> # previous is fine; but next...
> str(as.data.table(x))
Classes 'data.table' and 'data.frame': 11 obs. of 1 variable:
$ now: POSIXlt, format: Error en format.POSIXlt(object): a valid "POSIXlt" object has names
> setDT(x)
> head(x)
Error en format.POSIXlt(x, ...):
a valid "POSIXlt" object is a list of at least 9 elements |
Ah, I guess I see what you mean now. We already error for list input to setDT(list(as.POSIXlt(Sys.time())))
# Error in setDT(list(as.POSIXlt(Sys.time()))) :
# Column 1 has class 'POSIXlt'.... Line 220 in a599557
I still suspect erroring is not the best choice, so I propose moving to Found the error has been there since #646. |
Also related: #5981 |
OK I have some progress in branch https://github.com/Rdatatable/data.table/tree/setdt-posixlt After some effort, I think it's really too much work to get So rather than |
Hmm just discovered another inconsistency of M = list(matrix(1:6,ncol=2L))
setDT(M)[]
# V1 V2
# <int> <int>
# 1: 1 4
# 2: 2 5
# 3: 3 6
DF = data.frame(row.names=letters[1:3])
DF$M = matrix(1:6, ncol=2L)
setDT(DF) # issues warning
DF
# Error in dimnames(x) <- dn :
# length of 'dimnames' [1] not equal to array extent IMO the behavior of |
Actually, note that the structure now <- as.POSIXlt(Sys.time())
str(data.frame(now))
'data.frame': 1 obs. of 1 variable:
$ now: POSIXct, format: "2024-12-12 09:02:40"
str(data.frame(x=now))
'data.frame': 1 obs. of 1 variable:
$ x: POSIXct, format: "2024-12-12 09:02:40"
str(as.data.frame(now))
'data.frame': 1 obs. of 1 variable:
$ now: POSIXct, format: "2024-12-12 09:02:40" |
I was doing
setDT()
of a dataframe (or tibble) with a variablePOSIXlt
. It seems not to be any problem. Even I can dostr()
of that object. It does not show anything strange. Then errors started to appear when I tried to manipulate the object or simply dohead()
.As I have found the problem, I would ask/suggest why do not produce the error when running
setDT
or, otherwise show a warning?, which would tell about the problem that there are POSIXlt variables and they are not supported. Possibly these variables could be removed or coerced to POSIXct or another better format fordata.table
and the warning would tell about that.Update: Minimal example with session info:
Any case, thank you for this great package.
The text was updated successfully, but these errors were encountered: