Commit

add list column example in intro vignette (#6558)
gurbuxanink authored Dec 3, 2024
1 parent 6be4cdd commit 98cf24e
Showing 1 changed file with 20 additions and 0 deletions.
20 changes: 20 additions & 0 deletions vignettes/datatable-intro.Rmd
@@ -643,6 +643,26 @@ DT[, print(list(c(a,b))), by = ID] # (2)

In (1), for each group, a vector is returned, with length = 6,4,2 here. However, (2) returns a list of length 1 for each group, with its first element holding vectors of length 6,4,2. Therefore, (1) results in a length of ` 6+4+2 = `r 6+4+2``, whereas (2) returns `1+1+1=`r 1+1+1``.
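
The difference in lengths can be checked directly. The following is a minimal sketch, assuming a small stand-in table `DT.toy` (a hypothetical name, built here so the chunk runs on its own) with group sizes 3, 2 and 1; it returns results instead of printing them so the row counts can be compared.

```{r}
library(data.table)

## Hypothetical stand-in table: groups "b", "a", "c" have 3, 2 and 1 rows
DT.toy <- data.table(ID = c("b", "b", "b", "a", "a", "c"), a = 1:6, b = 7:12)

## A vector in j is spread over rows: one row per element of c(a, b)
res.vec <- DT.toy[, .(val = c(a, b)), by = ID]

## A list in j is kept whole: one row per group, each holding a vector
res.list <- DT.toy[, .(val = list(c(a, b))), by = ID]

nrow(res.vec)
nrow(res.list)
lengths(res.list$val)
```

Wrapping the vector in `list()` is what turns `val` into a list column with one element per group.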

The flexibility of `j` allows us to store any list object as an element of a `data.table`. For example, when statistical models are fit to groups, the fitted models can be stored in a list column of the resulting `data.table`. The code is concise and easy to understand.

```{r}
## Do long-distance flights make up more of their departure delay than short-distance flights?
## Does the amount made up vary by month?
flights[, `:=`(makeup = dep_delay - arr_delay)]
makeup.models <- flights[, .(fit = list(lm(makeup ~ distance))), by = .(month)]
makeup.models[, .(coefdist = coef(fit[[1]])[2], rsq = summary(fit[[1]])$r.squared), by = .(month)]
```
With data.frames, we need more complicated code to obtain the same result.
```{r}
setDF(flights)
flights.split <- split(flights, f = flights$month)
makeup.models.list <- lapply(flights.split, function(df) c(month = df$month[1], fit = list(lm(makeup ~ distance, data = df))))
makeup.models.df <- do.call(rbind, makeup.models.list)
sapply(makeup.models.df[, "fit"], function(model) c(coefdist = coef(model)[2], rsq = summary(model)$r.squared)) |> t() |> data.frame()
setDT(flights)
```
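
The same list-column pattern can be tried without the `flights` data. The following is a minimal sketch, assuming only the built-in `mtcars` dataset; the names `cars`, `cars.models` and `cars.coefs` are illustrative and not part of the vignette.

```{r}
library(data.table)

## Illustrative data: the built-in mtcars dataset, converted to a data.table
cars <- as.data.table(mtcars)

## Fit one linear model per cylinder count and keep each fit in the list column `fit`
cars.models <- cars[, .(fit = list(lm(mpg ~ wt))), by = .(cyl)]

## Every element of `fit` is a complete lm object
class(cars.models$fit[[1]])

## Pull the slope and R-squared out of each stored model, one row per group
cars.coefs <- cars.models[, .(slope = coef(fit[[1]])[["wt"]],
                              rsq = summary(fit[[1]])$r.squared), by = .(cyl)]
cars.coefs
```

Because each group has exactly one row in `cars.models`, `fit[[1]]` inside `by = .(cyl)` refers to that group's model.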

## Summary

The general form of `data.table` syntax is:
