Update vignettes/datatable-benchmarking.Rmd
Co-authored-by: Michael Chirico <[email protected]>
davidbudzynski and MichaelChirico authored Nov 3, 2023
1 parent fff9f7c commit 5744534
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion vignettes/datatable-benchmarking.Rmd
@@ -24,7 +24,7 @@ sudo lshw -class disk
sudo hdparm -t /dev/sda
```

-When comparing `fread` to non-R solutions be aware that R requires values of character columns to be added to _R's global string cache_. This takes time when reading data but later operations benefit since the character strings have already been cached. Consequently, as well as timing isolated tasks (such as `fread` alone), it's a good idea to benchmark a pipeline of tasks such as reading data, computing operators and producing final output and report the total time of the pipeline.
+When comparing `fread` to non-R solutions be aware that R requires values of character columns to be added to _R's global string cache_. This takes time when reading data but later operations benefit since the character strings have already been cached. Consequently, in addition to timing isolated tasks (such as `fread` alone), it's a good idea to benchmark the total time of an end-to-end pipeline of tasks such as reading data, manipulating it, and producing final output.

# subset: threshold for index optimization on compound queries
