Commit

fix typos
davidbudzynski committed Jun 5, 2022
1 parent dcd1093 commit 4975bfe
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions vignettes/datatable-secondary-indices-and-auto-indexing.Rmd
@@ -113,7 +113,7 @@ b) reordering the entire data.table, by reference, based on the order vector com

#

Computing the order isn't the time consuming part, since data.table uses true radix sorting on integer, character and numeric vectors. However reordering the data.table could be time consuming (depending on the number of rows and columns).
Computing the order isn't the time consuming part, since data.table uses true radix sorting on integer, character and numeric vectors. However, reordering the data.table could be time consuming (depending on the number of rows and columns).

Unless our task involves repeated subsetting on the same column, fast key based subsetting could effectively be nullified by the time to reorder, depending on our data.table dimensions.

@@ -147,7 +147,7 @@ As we will see in the next section, the `on` argument provides several advantage

* allows for a cleaner syntax by having the columns on which the subset is performed as part of the syntax. This makes the code easier to follow when looking at it at a later point.

Note that `on` argument can also be used on keyed subsets as well. In fact, we encourage to provide the `on` argument even when subsetting using keys for better readability.
Note that `on` argument can also be used on keyed subsets as well. In fact, we encourage providing the `on` argument even when subsetting using keys for better readability.

#

@@ -276,7 +276,7 @@ flights[.(c("LGA", "JFK", "EWR"), "XNA"), mult = "last", on = c("origin", "dest"

## 3. Auto indexing

First we looked at how to fast subset using binary search using *keys*. Then we figured out that we could improve performance even further and have more cleaner syntax by using secondary indices.
First we looked at how to fast subset using binary search using *keys*. Then we figured out that we could improve performance even further and have cleaner syntax by using secondary indices.

That is what *auto indexing* does. At the moment, it is only implemented for binary operators `==` and `%in%`. An index is automatically created *and* saved as an attribute. That is, unlike the `on` argument which computes the index on the fly each time (unless one already exists), a secondary index is created here.
