From b0036bcaa0c964b9e1ec9bc5f055794bd88f9bf8 Mon Sep 17 00:00:00 2001
From: David Budzynski
Date: Sun, 5 Jun 2022 12:19:35 +0100
Subject: [PATCH] fix typos and adhere to British English spelling

---
 vignettes/datatable-keys-fast-subset.Rmd | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/vignettes/datatable-keys-fast-subset.Rmd b/vignettes/datatable-keys-fast-subset.Rmd
index 917a90413..1a7cb37ef 100644
--- a/vignettes/datatable-keys-fast-subset.Rmd
+++ b/vignettes/datatable-keys-fast-subset.Rmd
@@ -19,7 +19,7 @@ knitr::opts_chunk$set(
  collapse = TRUE)
 ```
 
-This vignette is aimed at those who are already familiar with *data.table* syntax, its general form, how to subset rows in `i`, select and compute on columns, add/modify/delete columns *by reference* in `j` and group by using `by`. If you're not familiar with these concepts, please read the *"Introduction to data.table"* and *"Reference semantics"* vignettes first.
+This vignette is aimed at those who are already familiar with *data.table* syntax, its general form, how to subset rows in `i`, select and compute on columns, add/modify/delete columns *by reference* in `j` and group by using `by`. If you're not familiar with these concepts, please read the *"Introduction to data.table"* and *"Reference semantics"* vignettes first.
 
 ***
 
@@ -146,7 +146,7 @@ head(flights)
 
 #### set* and `:=`:
 
-In *data.table*, the `:=` operator and all the `set*` (e.g., `setkey`, `setorder`, `setnames` etc..) functions are the only ones which modify the input object *by reference*.
+In *data.table*, the `:=` operator and all the `set*` (e.g., `setkey`, `setorder`, `setnames` etc...) functions are the only ones which modify the input object *by reference*.
 
 Once you *key* a *data.table* by certain columns, you can subset by querying those key columns using the `.()` notation in `i`. Recall that `.()` is an *alias to* `list()`.
 
@@ -238,7 +238,7 @@ flights[.(unique(origin), "MIA")]
 
 #### What's happening here?
 
-* Read [this](#multiple-key-point) again. The value provided for the second key column *"MIA"* has to find the matching values in `dest` key column *on the matching rows provided by the first key column `origin`*. We can not skip the values of key columns *before*. Therefore we provide *all* unique values from key column `origin`.
+* Read [this](#multiple-key-point) again. The value provided for the second key column *"MIA"* has to find the matching values in `dest` key column *on the matching rows provided by the first key column `origin`*. We can not skip the values of key columns *before*. Therefore, we provide *all* unique values from key column `origin`.
 
 * *"MIA"* is automatically recycled to fit the length of `unique(origin)` which is *3*.
 
@@ -307,7 +307,7 @@ key(flights)
 
 * And on those row indices, we replace the `key` column with the value `0`.
 
-* Since we have replaced values on the *key* column, the *data.table* `flights` isn't sorted by `hour` any more. Therefore, the key has been automatically removed by setting to NULL.
+* Since we have replaced values on the *key* column, the *data.table* `flights` isn't sorted by `hour` anymore. Therefore, the key has been automatically removed by setting to NULL.
 
 Now, there shouldn't be any *24* in the `hour` column.
 
@@ -393,7 +393,7 @@ flights[origin == "JFK" & dest == "MIA"]
 
 One advantage very likely is shorter syntax. But even more than that, *binary search based subsets* are **incredibly fast**.
 
-As the time goes `data.table` gets new optimization and currently the latter call is automatically optimized to use *binary search*.
+As the time goes `data.table` gets new optimisation and currently the latter call is automatically optimized to use *binary search*.
 To use slow *vector scan* key needs to be removed.
 
 ```{r eval = FALSE}