Skip to content

Commit

Permalink
docs+test: cache docs and tests
Browse files Browse the repository at this point in the history
  • Loading branch information
dsweber2 committed Sep 1, 2023
1 parent c9abe5d commit f4d0584
Show file tree
Hide file tree
Showing 11 changed files with 235 additions and 35 deletions.
36 changes: 21 additions & 15 deletions R/cache.R
Original file line number Diff line number Diff line change
Expand Up @@ -3,14 +3,17 @@
cache_environ <- new.env(parent = emptyenv())
cache_environ$use_cache <- NULL
cache_environ$epidatr_cache <- NULL
#' create a new cache for this session
#' create or renew a cache for this session
#'
#' @description
#' `set_cache` (re)defines the cache to use. This does not clear existing data at any previous location, but defines a new access for this R session.
#' Say your cache is normally stored in the default directory, but for the current session you want to save your results in `~/my/temporary/savedirectory`, then you would call `set_cache(dir = "~/my/temporary/savedirectory")`.
#' Or if you know the data from 2 days ago is wrong, you could call `set_cache(days = 1)` to clear older data. In both cases, these changes would only last for a single session.
#' In general, it is better to set your preferences via environmental variables in your `.Renviron` folder, with the corresponding variables listed in the arguments section below.
#' In addition to those, there is the `EPIDATR_USE_CACHE` environmental variable, which unless defined to be `TRUE` otherwise defaults to `FALSE`.
#' By default, epidatr re-requests data from the API on every call of `fetch`. In case you find yourself repeatedly calling the same data, you can enable the cache using either this function for a given session, or environmental variables for a persistent cache.
#' The typical recommended workflow for using the cache is to set the environmental variables `EPIDATR_USE_CACHE=TRUE` and `EPIDATR_CACHE_DIRECTORY="/your/directory/here"`in your `.Renviron`, for example by calling `usethis::edit_r_environ()`.
#' See the parameters below for some more configurables if you're so inclined.
#'
#' `set_cache` (re)defines the cache to use in a particular R session. This does not clear existing data at any previous location, but instead creates a handle to the new cache using [cachem](https://cachem.r-lib.org/index.html) that seamlessly handles caching for you.
#' Say your cache is normally stored in some default directory, but for the current session you want to save your results in `~/my/temporary/savedirectory`, then you would call `set_cache(dir = "~/my/temporary/savedirectory")`.
#' Or if you know the data from 2 days ago is wrong, you could call `set_cache(days = 1)` to clear older data whenever the cache is referenced.
#' In both cases, these changes would only last for a single session (though the deleted data would be gone permanently!).
#'
#' An important feature of the caching in this package is that only calls which specify either `issues` before a certain date, or `as_of` before a certain date will actually cache. For example the call
#' ```
Expand All @@ -35,7 +38,7 @@ cache_environ$epidatr_cache <- NULL
#' as_of = "2023-08-01"
#' )
#' ```
#' *will* cache, since normal new versions of data can't invalidate it. It is still possible that Delphi may patch such data, but the frequency is on the order of months rather than days. We are working on creating a public channel to communicate such updates. Stars for `issues` won't cache, since they're subject to cache invalidation by normal versioning.
#' *will* cache, since normal new versions of data can't invalidate it (since they would be `as_of` a later date). It is still possible that Delphi may patch such data, but the frequency is on the order of months rather than days. We are working on creating a public channel to communicate such updates. While specifying `issues` will usually cache, a call with `issues="*"` won't cache, since its subject to cache invalidation by normal versioning.
#'
#' On the backend, the cache uses cachem, with filenames generated using an md5 encoding of the call url. Each file corresponds to a unique epidata-API call.
#' @examples
Expand All @@ -49,12 +52,13 @@ cache_environ$epidatr_cache <- NULL
#' )
#' }
#'
#' @param cache_dir the directory in which the cache is stored. By default, this is `tools::R_user_dir()` if on R 4.0+, but must be specified for earlier versions of R. The path can be either relative or absolute. The environmental variable is `EPIDATR_CACHE_DIR`
#' @param days the maximum length of time in days to keep any particular cached call. By default this is `1`. The environmental variable is `EPIDATR_CACHE_MAX_AGE_DAYS`
#' @param cache_dir the directory in which the cache is stored. By default, this is `tools::R_user_dir()` if on R 4.0+, but must be specified for earlier versions of R. The path can be either relative or absolute. The environmental variable is `EPIDATR_CACHE_DIR`.
#' @param days the maximum length of time in days to keep any particular cached call. By default this is `1`. The environmental variable is `EPIDATR_CACHE_MAX_AGE_DAYS`.
#' @param max_size the size of the entire cache, in MB, at which to start pruning entries. By default this is `1024`, or 1GB. The environmental variable is `EPIDATR_CACHE_MAX_SIZE_MB`.
#' @param logfile where cachem's log of transactions is stored, relative to the cache directory. By default, it is `"logfile.txt"`. The environmental variable is `EPIDATR_CACHE_LOGFILE`.
#' @param prune_rate how many calls to go between checking if any cache elements are too old or if the cache overall is too large. Defaults to `2000L`. Since cachem fixes the max time between prune checks to 5 seconds, there's little reason to actually change this parameter. Doesn't have a corresponding environmental variable.
#' @param confirm whether to confirm directory creation. default is `TRUE`; should only be set in scripts
#' @param confirm whether to confirm directory creation. default is `TRUE`; should only be set in non-interactive scripts
#' @seealso [clear_cache] to delete the old cache while making a new one, [disable_cache] to disable without deleting, and [cache_info]
#' @export
#' @import cachem
set_cache <- function(cache_dir = NULL,
Expand Down Expand Up @@ -122,9 +126,7 @@ set_cache <- function(cache_dir = NULL,
}
}


#' manually reset the cache, deleting all currently saved data and starting afresh
#'
#' @description
#' deletes the current cache and resets a new cache. Deletes local data! If you are using a session unique cache, you will have to pass the arguments you used for `set_cache` earlier, otherwise the system-wide `.Renviron`-based defaults will be used.
#' @examples
Expand All @@ -137,7 +139,6 @@ set_cache <- function(cache_dir = NULL,
#' prune_rate = 20L
#' )
#' }
#'
#' @param disable instead of setting a new cache, disable caching entirely; defaults to `FALSE`
#' @inheritParams set_cache
#' @seealso [set_cache] to start a new cache (and general caching info), [disable_cache] to only disable without deleting, and [cache_info]
Expand All @@ -154,19 +155,24 @@ clear_cache <- function(disable = FALSE, ...) {
#' @description
#' Disable caching until you call `set_cache` or restart R. The files defining the cache are untouched. If you are looking to disable the caching more permanently, set `EPIDATR_USE_CACHE=FALSE` as environmental variable in your `.Renviron`.
#' @export
#' @seealso [set_cache] to start a new cache (and general caching info), [clear_cache] to delete the cache and set a new one, and [cache_info]
#' @import cachem
disable_cache <- function() {
cache_environ$epidatr_cache <- NULL
}

#' turn off the caching for this session
#' describe current cache
#' @description
#' Print out the information about the cache (as would be returned by cachem's `info()` method)
#' @examples
#' cache_info()
#' @seealso [set_cache] to start a new cache (and general caching info), [clear_cache] to delete the cache and set a new one, and [disable_cache] to disable without deleting
#' @export
cache_info <- function() {
cache_environ$epidatr_cache$info()
}

#' create a new cache for this session
#' dispatch caching
#'
#' @description
#' the guts of caching, its interposed between fetch and the specific fetch methods. Internal method only.
Expand Down
6 changes: 6 additions & 0 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,12 @@ reference:
- avail_endpoints
- epirange
- timeset
- title: Control Caching behavior
- contents:
- set_cache
- clear_cache
- disable_cache
- cache_info
- title: Make API requests
desc: Query Delphi Epidata endpoints
- contents:
Expand Down
2 changes: 1 addition & 1 deletion man/cache_epidata_call.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 7 additions & 1 deletion man/cache_info.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 12 additions & 0 deletions man/check_is_cachable.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 12 additions & 0 deletions man/check_is_recent.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 7 additions & 2 deletions man/clear_cache.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions man/disable_cache.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

29 changes: 19 additions & 10 deletions man/set_cache.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions tests/testthat/setup.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
current_cache <- cache_environ$epidatr_cache
disable_cache()
withr::defer(cache_environ$epidatr_cache <- current_cache, teardown_env())
Loading

0 comments on commit f4d0584

Please sign in to comment.