Commit

Fix label_number_si() to use SI prefixes (#235)
* Edit label_number_si() to use SI prefixes

* Update test to send another argument to number()

* Fix unicode error emitted by R CMD check

* Remove non-ASCII character from docs

* Fix unicode mismatch on Windows

* Another attempt to resolve Windows unicode

* Share SI prefixes with label_bytes

* Restore whitespace

* Add billion_scale argument to label_dollar()

* Remove wikipedia hyperlink

* Rename argument as rescale_large

* Work with accuracy and scale arguments

* Clarify short scale used internationally for finance

* Refactor common code in rescale_by_suffix()

* Set default accuracy to NULL

* Fix conflicting factor levels on R 3.4

* Rename short/long scale functions

* Move SI prefixes into SI file

* label_bytes() uses rescale_by_suffix()

* label_number_si() supports scale argument

* Remove sep argument from label_number_si()
This wasn't actually doing anything, because user inputs were overwritten.

* First argument of label_number_si() is unit

* Require unit argument

* Update NEWS

* NEWS update

* Document when `scale` argument is useful

* Remove headings from NEWS

* Fix docs typo
davidchall authored Mar 26, 2021
1 parent bb1c423 commit 9c5a00d
Showing 20 changed files with 278 additions and 83 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -37,4 +37,4 @@ Suggests:
Encoding: UTF-8
LazyLoad: yes
Roxygen: list(markdown = TRUE, r6 = FALSE)
RoxygenNote: 7.1.0
RoxygenNote: 7.1.1
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -132,10 +132,12 @@ export(pvalue_format)
export(reciprocal_trans)
export(regular_minor_breaks)
export(rescale)
export(rescale_long_scale)
export(rescale_max)
export(rescale_mid)
export(rescale_none)
export(rescale_pal)
export(rescale_short_scale)
export(reverse_trans)
export(scientific)
export(scientific_format)
22 changes: 22 additions & 0 deletions NEWS.md
@@ -3,6 +3,28 @@
* `manual_pal()` now always returns an unnamed colour vector, which is easy to
use with `ggplot2::discrete_scale()` (@yutannihilation, #284).

* `label_number_si()` now correctly uses [SI prefixes](https://en.wikipedia.org/wiki/Metric_prefix)
(e.g. abbreviations "k" for "kilo-" and "m" for "milli-"). It previously used
[short scale abbreviations](https://en.wikipedia.org/wiki/Long_and_short_scales)
(e.g. "M" for million, "B" for billion). The short scale is most commonly used
in finance, so it is now supported via the new `rescale_large` argument of
`label_dollar()` (@davidchall, #235).

* `label_number_si()` now requires the `unit` argument to be specified. The default
value of the `accuracy` argument is now `NULL`, which automatically chooses
the precision. The `sep` argument, which had no effect, is removed (@davidchall, #235).

* `label_dollar()` gains a `rescale_large` argument to support scaling of large
numbers by suffix (e.g. "M" for million, "B" for billion). In finance, the
short scale is most prevalent (i.e. 1 billion = 1 thousand million). In other
contexts, the long scale might be desired (i.e. 1 billion = 1 million million).
These two common scales are supported by setting `rescale_large = rescale_short_scale()`
or `rescale_large = rescale_long_scale()`, but custom scaling-by-suffix is also
supported (@davidchall, #235).

* `label_bytes()` now correctly accounts for the `scale` argument when choosing
auto units (@davidchall, #235).

# scales 1.1.1

* `breaks_width()` now handles `difftime`/`hms` objects (@bhogan-mitre, #244).
29 changes: 13 additions & 16 deletions R/label-bytes.R
@@ -1,4 +1,4 @@
#' Label bytes (1 kb, 2 MB, etc)
#' Label bytes (1 kB, 2 MB, etc)
#'
#' Scale bytes into human friendly units. Can use either SI units (e.g.
#' kB = 1000 bytes) or binary units (e.g. kiB = 1024 bytes). See
@@ -10,7 +10,7 @@
#' SI units (base 1000).
#' * "kiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", and "YiB" for
#' binary units (base 1024).
#' * `auto_si` or `auto_binary` to automatically pick the most approrpiate
#' * `auto_si` or `auto_binary` to automatically pick the most appropriate
#' unit for each value.
#' @inheritParams number_format
#' @param ... Other arguments passed on to [number()]
@@ -37,7 +37,7 @@
#' breaks = breaks_width(250 * 1024),
#' label = label_bytes("auto_binary")
#' )
label_bytes <- function(units = "auto_si", accuracy = 1, ...) {
label_bytes <- function(units = "auto_si", accuracy = 1, scale = 1, ...) {
stopifnot(is.character(units), length(units) == 1)
force_all(accuracy, ...)

@@ -48,8 +48,10 @@ label_bytes <- function(units = "auto_si", accuracy = 1, ...) {
base <- switch(units, auto_binary = 1024, auto_si = 1000)
suffix <- switch(units, auto_binary = "iB", auto_si = "B")

power <- findInterval(abs(x), c(0, base^powers)) - 1L
units <- paste0(c("", names(powers))[power + 1L], suffix)
rescale <- rescale_by_suffix(x * scale, breaks = c(0, base^powers))

suffix <- paste0(" ", rescale$suffix, suffix)
scale <- scale * rescale$scale
} else {
si_units <- paste0(names(powers), "B")
bin_units <- paste0(names(powers), "iB")
@@ -63,22 +65,17 @@ label_bytes <- function(units = "auto_si", accuracy = 1, ...) {
} else {
stop("'", units, "' is not a valid unit", call. = FALSE)
}

suffix <- paste0(" ", units)
scale <- scale / base^power
}

number(
x / base^power,
x,
accuracy = accuracy,
suffix = paste0(" ", units),
scale = scale,
suffix = suffix,
...
)
}
}

# Helpers -----------------------------------------------------------------

si_powers <- (-8:8) * 3
names(si_powers) <- c(
rev(c("m", "\u00b5", "n", "p", "f", "a", "z", "y")), "",
"k", "M", "G", "T", "P", "E", "Z", "Y"
)
si_powers
56 changes: 52 additions & 4 deletions R/label-dollar.R
@@ -14,6 +14,11 @@
#' value is less than `largest_with_cents` which by default is 100,000.
#' @param prefix,suffix Symbols to display before and after value.
#' @param negative_parens Display negative using parentheses?
#' @param rescale_large Named list indicating suffixes given to large values
#' (e.g. thousands, millions, billions, trillions). Name gives suffix, and
#' value specifies the power-of-ten. The two most common scales are provided
#' (`rescale_short_scale()` and `rescale_long_scale()`).
#' If `NULL`, the default, these suffixes aren't used.
#' @param ... Other arguments passed on to [base::format()].
#' @export
#' @family labels for continuous scales
@@ -23,7 +28,7 @@
#'
#' # Customise currency display with prefix and suffix
#' demo_continuous(c(1, 100), labels = label_dollar(prefix = "USD "))
#' euro <- dollar_format(
#' euro <- label_dollar(
#' prefix = "",
#' suffix = "\u20ac",
#' big.mark = ".",
@@ -33,10 +38,26 @@
#'
#' # Use negative_parens = TRUE for finance style display
#' demo_continuous(c(-100, 100), labels = label_dollar(negative_parens = TRUE))
#'
#' # In finance the short scale is most prevalent
#' dollar <- label_dollar(rescale_large = rescale_short_scale())
#' demo_log10(c(1, 1e18), breaks = log_breaks(7, 1e3), labels = dollar)
#'
#' # In other contexts the long scale might be used
#' long <- label_dollar(prefix = "", rescale_large = rescale_long_scale())
#' demo_log10(c(1, 1e18), breaks = log_breaks(7, 1e3), labels = long)
#'
#' # You can also define a custom naming scheme
#' gbp <- label_dollar(
#' prefix = "\u00a3",
#' rescale_large = c(k = 3L, m = 6L, bn = 9L, tn = 12L)
#' )
#' demo_log10(c(1, 1e12), breaks = log_breaks(5, 1e3), labels = gbp)
label_dollar <- function(accuracy = NULL, scale = 1, prefix = "$",
suffix = "", big.mark = ",", decimal.mark = ".",
trim = TRUE, largest_with_cents = 100000,
negative_parens = FALSE, ...) {
negative_parens = FALSE, rescale_large = NULL,
...) {
force_all(
accuracy,
scale,
@@ -47,6 +68,7 @@ label_dollar <- function(accuracy = NULL, scale = 1, prefix = "$",
trim,
largest_with_cents,
negative_parens,
rescale_large,
...
)
function(x) dollar(
@@ -60,6 +82,7 @@ label_dollar <- function(accuracy = NULL, scale = 1, prefix = "$",
trim = trim,
largest_with_cents = largest_with_cents,
negative_parens,
rescale_large = rescale_large,
...
)
}
@@ -86,9 +109,10 @@ dollar_format <- label_dollar
dollar <- function(x, accuracy = NULL, scale = 1, prefix = "$",
suffix = "", big.mark = ",", decimal.mark = ".",
trim = TRUE, largest_with_cents = 100000,
negative_parens = FALSE, ...) {
negative_parens = FALSE, rescale_large = NULL,
...) {
if (length(x) == 0) return(character())
if (is.null(accuracy)) {
if (is.null(accuracy) && is.null(rescale_large)) {
if (needs_cents(x * scale, largest_with_cents)) {
accuracy <- .01
} else {
@@ -102,6 +126,18 @@ dollar <- function(x, accuracy = NULL, scale = 1, prefix = "$",
negative <- !is.na(x) & x < 0
x <- abs(x)

if (!is.null(rescale_large)) {
if (!(is.integer(rescale_large) && all(rescale_large > 0))) {
stop("`rescale_large` must be positive integers.", call. = FALSE)
}

rescale <- rescale_by_suffix(x * scale, breaks = c(0, 10^rescale_large))

sep <- if (suffix == "") "" else " "
suffix <- paste0(rescale$suffix, sep, suffix)
scale <- scale * rescale$scale
}

amount <- number(
x,
accuracy = accuracy,
@@ -126,3 +162,15 @@ dollar <- function(x, accuracy = NULL, scale = 1, prefix = "$",

amount
}

#' @export
#' @rdname label_dollar
rescale_short_scale <- function() {
c(K = 3L, M = 6L, B = 9L, T = 12L)
}

#' @export
#' @rdname label_dollar
rescale_long_scale <- function() {
c(K = 3L, M = 6L, B = 12L, T = 18L)
}
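
Note on the `rescale_large` validation above: because `dollar()` tests `is.integer(rescale_large)`, a custom scale apparently has to be written with integer literals (`3L`, not `3`). The sketch below isolates that check. `is_valid_rescale_large()` is a hypothetical helper name used for illustration only; it is not part of scales.

```r
# Hypothetical helper mirroring the check inside dollar(); not part of scales.
is_valid_rescale_large <- function(rescale_large) {
  is.integer(rescale_large) && all(rescale_large > 0)
}

# Integer powers of ten, as in the gbp example in the roxygen docs
ok <- is_valid_rescale_large(c(k = 3L, m = 6L, bn = 9L, tn = 12L))

# Plain numeric literals are doubles, so the check rejects them
bad <- is_valid_rescale_large(c(k = 3, m = 6))

# Non-positive powers are rejected as well
neg <- is_valid_rescale_large(c(m = -3L))
```

If this reading is right, a call such as `label_dollar(rescale_large = c(k = 3, m = 6))` would stop with "`rescale_large` must be positive integers." once the labeller is applied to data.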
61 changes: 35 additions & 26 deletions R/label-number-si.R
@@ -1,46 +1,55 @@
#' Label numbers with SI prefixes (2k, 1M, 5T etc)
#' Label numbers with SI prefixes (2 kg, 5 mm, etc)
#'
#' `number_si()` automatically scales and labels with the best SI prefix,
#' "K" for values \eqn{\ge} 10e3, "M" for \eqn{\ge} 10e6,
#' "B" for \eqn{\ge} 10e9, and "T" for \eqn{\ge} 10e12.
#' `label_number_si()` automatically adds the most suitable SI prefix and scales
#' the values appropriately. For example, values greater than 1000 gain a "k"
#' prefix (abbreviated from "kilo-") and are scaled by 1/1000.
#' See [Metric Prefix](https://en.wikipedia.org/wiki/Metric_prefix) on Wikipedia
#' for more details.
#'
#' @inherit number_format return params
#' @param unit Optional units specifier.
#' @param sep Separator between number and SI unit. Defaults to `" "` if
#' `units` is supplied, and `""` if not.
#' @param unit Unit of measurement (e.g. `"m"` for meter, the SI unit of length).
#' @param scale A scaling factor: `x` will be multiplied by `scale` before
#' formatting. This is useful if the underlying data is already using an SI
#' prefix.
#' @export
#' @family labels for continuous scales
#' @family labels for log scales
#' @examples
#' demo_continuous(c(1, 1e9), label = label_number_si())
#' demo_continuous(c(1, 5000), label = label_number_si(unit = "g"))
#' demo_continuous(c(1, 1000), label = label_number_si(unit = "m"))
#' demo_continuous(c(1, 1000), labels = label_number_si("m"))
#'
#' demo_log10(c(1, 1e9), breaks = log_breaks(10), labels = label_number_si())
label_number_si <- function(accuracy = 1, unit = NULL, sep = NULL, ...) {
sep <- if (is.null(unit)) "" else " "
#' demo_log10(c(1, 1e9), breaks = log_breaks(10), labels = label_number_si("m"))
#' demo_log10(c(1e-9, 1), breaks = log_breaks(10), labels = label_number_si("g"))
#'
#' # use scale when data already uses SI prefix (e.g. stored in kg)
#' kg <- label_number_si("g", scale = 1e3)
#' demo_log10(c(1e-9, 1), breaks = log_breaks(10), labels = kg)
label_number_si <- function(unit, accuracy = NULL, scale = 1, ...) {
sep <- if (is.null(unit) || !nzchar(unit)) "" else " "
force_all(accuracy, ...)

function(x) {
breaks <- c(0, 10^c(K = 3, M = 6, B = 9, T = 12))

n_suffix <- cut(abs(x),
breaks = c(unname(breaks), Inf),
labels = c(names(breaks)),
right = FALSE
)
n_suffix[is.na(n_suffix)] <- ""
suffix <- paste0(sep, n_suffix, unit)
rescale <- rescale_by_suffix(x * scale, breaks = 10^si_powers)

scale <- 1 / breaks[n_suffix]
# for handling Inf and 0-1 correctly
scale[which(scale %in% c(Inf, NA))] <- 1
suffix <- paste0(sep, rescale$suffix, unit)
scale <- scale * rescale$scale

number(x,
accuracy = accuracy,
scale = unname(scale),
scale = scale,
suffix = suffix,
...
)
}
}

# power-of-ten prefixes used by the International System of Units (SI)
# https://www.bipm.org/en/measurement-units/prefixes.html
#
# note: irregular prefixes (hecto, deca, deci, centi) are not stored
# because they don't commonly appear in scientific usage anymore
si_powers <- (-8:8) * 3
names(si_powers) <- c(
rev(c("m", "\u00b5", "n", "p", "f", "a", "z", "y")), "",
"k", "M", "G", "T", "P", "E", "Z", "Y"
)
si_powers
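
To illustrate how the `scale` argument interacts with prefix selection, here is a rough, self-contained sketch. It reuses the `si_powers` definition from this file, but `label_si_sketch()` is a hypothetical stand-in, not the package function: it picks the prefix with `findInterval()` instead of the `cut()`-based `rescale_by_suffix()` helper, rounds with `signif()` instead of calling `number()`, and gives zero or extremely small values no prefix.

```r
# SI prefixes as defined in the commit
si_powers <- (-8:8) * 3
names(si_powers) <- c(
  rev(c("m", "\u00b5", "n", "p", "f", "a", "z", "y")), "",
  "k", "M", "G", "T", "P", "E", "Z", "Y"
)

# Hypothetical stand-in for label_number_si(); an assumption for illustration.
label_si_sketch <- function(x, unit, scale = 1) {
  value <- x * scale                             # apply the user scale first
  idx <- findInterval(abs(value), 10^si_powers)  # largest power of ten <= |value|
  idx[idx == 0] <- which(si_powers == 0)         # simplification: no prefix below 1e-24
  rescaled <- value / 10^si_powers[idx]          # rescale by the chosen prefix
  paste0(signif(rescaled, 3), " ", names(si_powers)[idx], unit)
}

# Data stored in grams
grams <- label_si_sketch(c(0.005, 50, 5000), unit = "g")

# Data stored in kilograms: scale = 1e3 converts to grams before choosing a prefix
kg <- label_si_sketch(c(2e-6, 0.002, 5), unit = "g", scale = 1e3)
```

Under these assumptions, `grams` is `"5 mg" "50 g" "5 kg"` and `kg` is `"2 mg" "2 g" "5 kg"`, which matches the documented purpose of `scale`: the data already carries an SI prefix, so it is converted to the base unit before a prefix is chosen.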
2 changes: 1 addition & 1 deletion R/label-number.r
@@ -24,7 +24,7 @@
#'
#' Applied to rescaled data.
#' @param scale A scaling factor: `x` will be multiplied by `scale` before
#' formating. This is useful if the underlying data is very small or very
#' formatting. This is useful if the underlying data is very small or very
#' large.
#' @param prefix,suffix Symbols to display before and after value.
#' @param big.mark Character used between every 3 digits to separate thousands.
20 changes: 20 additions & 0 deletions R/rescale_by_suffix.R
@@ -0,0 +1,20 @@
# each value of x is assigned a suffix and associated scaling factor
rescale_by_suffix <- function(x, breaks) {
suffix <- as.character(cut(
abs(x),
breaks = c(unname(breaks), Inf),
labels = names(breaks),
right = FALSE
))
suffix[is.na(suffix)] <- names(which.min(breaks))

scale <- unname(1 / breaks[suffix])
scale[which(scale %in% c(Inf, NA))] <- 1

# exact zero is not scaled
x_zero <- which(abs(x) == 0)
scale[x_zero] <- 1
suffix[x_zero] <- ""

list(scale = scale, suffix = suffix)
}
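
A worked example may help when reviewing `rescale_by_suffix()`. The block below copies the function from the diff above and applies it to the short-scale breaks that `dollar()` builds from `rescale_short_scale()`. The input values are made up for illustration. Note that the unnamed `0` break gets the empty name `""`, and indexing `breaks[""]` yields `NA`, which is why the `scale %in% c(Inf, NA)` line resets those entries to 1.

```r
# Copied from R/rescale_by_suffix.R in this commit
rescale_by_suffix <- function(x, breaks) {
  suffix <- as.character(cut(
    abs(x),
    breaks = c(unname(breaks), Inf),
    labels = names(breaks),
    right = FALSE
  ))
  suffix[is.na(suffix)] <- names(which.min(breaks))

  scale <- unname(1 / breaks[suffix])
  scale[which(scale %in% c(Inf, NA))] <- 1

  # exact zero is not scaled
  x_zero <- which(abs(x) == 0)
  scale[x_zero] <- 1
  suffix[x_zero] <- ""

  list(scale = scale, suffix = suffix)
}

# Short-scale breaks, built the same way dollar() builds them
short_scale <- c(K = 3L, M = 6L, B = 9L, T = 12L)
breaks <- c(0, 10^short_scale)

x <- c(0, 500, 1500, 2.5e6, 3e9, 4e12)  # illustrative values only
res <- rescale_by_suffix(x, breaks = breaks)

res$suffix          # "" "" "K" "M" "B" "T"
scaled <- x * res$scale
```

Values below 1000 keep a scale of 1 and an empty suffix, while larger values are divided by the matching power of ten, so `scaled` is approximately `0, 500, 1.5, 2.5, 3, 4`.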
2 changes: 1 addition & 1 deletion man/brewer_pal.Rd


10 changes: 7 additions & 3 deletions man/label_bytes.Rd
