release 2.7.4 on CRAN
strengejacke committed Aug 5, 2018
1 parent 33b31a2 commit 880cdc2
Showing 72 changed files with 136 additions and 688 deletions.
16 changes: 8 additions & 8 deletions DESCRIPTION
@@ -2,8 +2,8 @@ Package: sjmisc
Type: Package
Encoding: UTF-8
Title: Data and Variable Transformation Functions
Version: 2.7.3.9000
Date: 2018-08-02
Version: 2.7.4
Date: 2018-08-04
Authors@R: person("Daniel", "Lüdecke", role = c("aut", "cre"), email = "[email protected]", comment = c(ORCID = "0000-0002-8895-3206"))
Maintainer: Daniel Lüdecke <[email protected]>
Description: Collection of miscellaneous utility functions, supporting data
@@ -17,9 +17,9 @@ Depends:
stats,
utils
Imports:
broom (>= 0.5.0),
broom (>= 0.4.5),
crayon,
dplyr (>= 0.7.1),
dplyr,
haven (>= 1.1.2),
magrittr,
pillar,
@@ -28,15 +28,15 @@ Imports:
sjlabelled (>= 1.0.12),
stringdist (>= 0.9.4),
stringr (>= 1.2.0),
tibble (>= 1.4.1),
tidyr (>= 0.7.0),
tibble,
tidyr,
tidyselect
Suggests:
ggplot2,
graphics,
mice,
sjPlot (>= 2.4.0),
sjstats (>= 0.13.0),
sjPlot,
sjstats,
knitr,
rmarkdown,
testthat
186 changes: 0 additions & 186 deletions NEWS.md
@@ -207,189 +207,3 @@ The recoding and transformation functions get scoped variants, allowing to selec
* `add_columns()` and `replace_columns()` crashed R when no data frame was specified in `...`-ellipses argument.
* `descr()` and `frq()` used wrong variable labels when processing grouped data frames in specific situations where the grouping variable had non-sequential values.
* `descr()` did not work for large data frames, because `psych::describe()` internally switched to fast mode by default in that case (removing columns from the output).

# sjmisc 2.4.0

## General

* Argument `value` in `set_na()` is deprecated. Please use `na` instead.
* Argument `recodes` in `rec()` is deprecated. Please use `rec` instead.
* Argument `lab` in `set_label()` is deprecated. Please use `label` instead.
* Argument `value` in `add_labels()` and `replace_labels()` is deprecated. Please use `labels` instead.
* Argument `value` in `ref_lvl()` is deprecated. Please use `lvl` instead.

## New functions

* `row_sums()` as wrapper of `rowSums()` to compute row sums, but works within pipe-workflow and with select-helpers for variables, and always returns a tibble.
* `row_means()` as wrapper of `sjstats::mean_n()` to compute row means, but works within pipe-workflow and with select-helpers for variables, and always returns a tibble.
* `%nin%` as complement to `%in%`.

## Changes to functions

* Functions `rec()`, `dicho()`, `center()`, `std()`, `recode_to()` and `group_var()` get an `append`-argument, to optionally return the original data including the transformed variables as new columns.
* `var_labels()` and `var_rename()` now give a warning if specified variables to rename or relabel do not exist in the data frame. Non-matching variables are ignored.
* If `model.term` does not exist in models, `spread_coef()` now prints the name of non-existing coefficients.
* `find_var()` gets a `fuzzy`-argument to enable fuzzy-matching for search pattern.

## Bug fixes

* `remove_empty_cols()` returned an empty data frame, when input data frame had no empty columns.
* `remove_empty_rows()` returned an empty data frame, when input data frame had no empty rows.
* `add_columns()` and `replace_columns()` in some cases coerced data frames of class `data.frame` with only one column into a vector, which gave an error when binding columns.
* Argument `part.dist.match` in `str_pos()` caused an error when being larger than 0.

# sjmisc 2.3.1

## General

* Re-exports `magrittr::%>%` (Bob Rudis said more packages should do so).

## New functions

* `replace_columns()` to replace variables in one data frame with variables from other data frames.

## Changes to functions

* `descr()` gets a `max.length`-argument to shorten variable labels in the output to a specific number of chars.
* `descr()` now also reports the percentage of missing values.
* `set_na()` no longer gives a warning when trying to replace values with `NA` for vectors that completely contained `NA`s.

## Bug fixes

* `descr()` now also works on single vectors as data argument.
* Fixed bugs with `write_*()`-functions.

# sjmisc 2.3.0

## General

* Added package-vignettes.
* Functions were largely revised to work seamlessly within the tidyverse. This may break existing code, but in the long run, consistent api-design makes working with the package more intuitive. See `vignette("design_philosophy", "sjmisc")` for more details.
* `as_labelled()` only converts vectors into `labelled`-class if vector has label attributes. This ensures that data can be properly saved into other formats, e.g. with `write_spss()`.
* The `write_*()`-functions have been revised and should now save data frame with any common classes of vectors (labelled, factor, numeric, atomic...).

## New functions

* `center()` and `std()` are moving from package `sjstats` to `sjmisc`.
* `add_columns()` to bind columns of first data frame at the end of all data frames.

## Changes to functions

* `frq()` now has the same argument-structure as `flat_table()`.
* Following functions now follow a consistent tidyverse-approach, with the data being the first argument, followed by variable names: `add_labels()`, `replace_labels()`, `remove_labels()`, `count_na()`, `rec()`, `dicho()`, `split_var()`, `drop_labels()`, `fill_labels()`, `group_var()`, `group_labels()`, `ref_lvl()`, `recode_to()`, `replace_na()`, `set_na()` and `set_labels()`.
* `get_values()` now sorts returned values by default, to be consistent with `get_labels()`.
* `spread_coef()` gets arguments `se` and `p.val`, to define whether standard errors and p-values should be included in the return value or not.

## Bug fixes

* `merge_df()` did not copy label attributes for data frame with identical variables (that were row-bound).
* `to_value()` did not work for character vectors of class labelled, with empty values (which typically have no value label).

# sjmisc 2.2.1

## Bug fixes

* The `sort.frq`-argument did not work in `frq()`.

# sjmisc 2.2.0

## New functions

* `zap_inf()` to "clean" vectors from `NaN` and infinite values.
* `descr()` to provide basic descriptive statistics (similar to `describe()` in the psych-package), but including variable labels and usable in pipe-workflows. Also works with grouped data frames.

## Changes to functions

* `rec()`, `split_var()` and `dicho()` get an argument `suffix`, to append a suffix to variable (column) names, if applied on a data frame.
* Value labels in `rec()` can now directly be assigned inside the `recodes`-syntax (see 'Details' in `?rec`).
* `find_var()` gets a `as.df`-argument, to return a data frame with matching variables, instead of their column indices only.
* `find_var()` gets a `as.varlab`-argument, to return a "summary" data frame with column number, variable name and variable label.
* `flat_table()` now also accepts grouped data frames.
* `flat_table()` gets a `show.values`-argument, to add values to associated labels in output.
* `frq()` now also accepts grouped data frames.
* `frq()` gets a `weight.by`-argument to weight frequencies.
* `set_na()` can now also find values by their value labels and replace them with NA.
* `set_na()` now removes unused value labels from values that have been replaced with NA.
* The `as.tag`-argument in `set_na()` now defaults to `FALSE`.
* `get_labels()` now always returns labels in sorted order of the associated values.
* `get_labels()` gets a `drop.unused`-argument, to automatically drop labels from values that don't occur in the vector.
* For a named vector as `labels`-argument, `set_labels()` now always sorts labels in sorted order of the associated values.
* `is_empty()` gets a `first.only`-argument, to evaluate either first or all elements of a character vector.

## Bug fixes

* `set_na()` did not work on vectors of class `Date` when argument `as.tag = TRUE`.
* `flat_table()` did not show values that had no value labels. Now all categories are shown in the frequency table.
* `rec()` did not properly copy labels of tagged NA values when not all recoded values appeared in the vector.
* `frq()` did not show correct values, when value labels of a vector were not sorted according their values.
* `set_labels()` did not set labels properly for ordered factors.
* `remove_labels()` returned NA-values for value labels (instead of no value labels) when the last value label of a vector was removed.


# sjmisc 2.1.0

## New functions

* `find_var()` to find variables in data frames by name or label.
* `var_labels()` as "tidyversed" alternative to `set_label()` to set variable labels.
* `var_rename()` to rename variables.

## Changes to functions

* Following functions now get an ellipses-argument `...`, to apply function only to selected variables, but return the complete data frame (thus, overwriting existing variables in a data frame, if requested): `to_factor()`, `to_value()`, `to_label()`, `to_character()`, `to_dummy()`, `zap_labels()`, `zap_unlabelled()`, `zap_na_tags()`.

## Bug fixes

* Fixed bug with copying attributes from tibbles for `merge_df()`.
* Fixed wrong argument-description in docs of `frq()`.

# sjmisc 2.0.1

## General

* Removed package `coin` from Imports.

## New functions

* `count_na()` to print a frequency table of tagged NA values.

## Changes to functions

* `set_na()` gets a `drop.levels` argument to keep or drop factor levels of values that have been replaced with NA.
* `set_na()` gets a `as.tag` argument to set NA values as regular or tagged NA.


# sjmisc 2.0.0

## General

* **sjmisc** now supports _tagged_ `NA` values, a new structure for labelled missing values introduced by the [haven-package](https://cran.r-project.org/package=haven). This means that functions or arguments that are no longer useful, have been removed while other functions dealing with NA values have been largely revised.
* All statistical functions have been removed and are now in a separate package, [sjstats](https://cran.r-project.org/package=sjstats).
* Removed some S3-methods for `labelled`-class, as these are now provided by the haven-package.
* Functions no longer check input for type `matrix`, to avoid conflicts with scaled vectors (that were recognized as matrix and hence treated as data frame).
* `table(*, exclude = NULL)` was changed to `table(*, useNA = "always")`, because of planned changes in upcoming R version 3.4.
* More functions (like `trim()` or `frq()`) now also have data frame- or list-methods.

## New functions

* `zap_na_tags()` to turn tagged NA values into regular NA values.
* `spread_coef()` to spread coefficients of multiple fitted models in nested data frames into columns.
* `merge_imputations()` to find the most likely imputed value for a missing value.
* `flat_table()` to print flat (proportional) tables of labelled variables.
* Added `to_character()` method.
* `big_mark()` to format large numbers with big marks.
* `empty_cols()` and `empty_rows()` to find variables or observations with exclusively NA values in a data frame.
* `remove_empty_cols()` and `remove_empty_rows()` to remove variables or observations with exclusively NA values from a data frame.

## Changes to functions
* `str_contains()` gets a `switch` argument to switch the role of `x` and `pattern`.
* `word_wrap()` coerces vectors to character if necessary.
* `to_label()` gets a `var.label` and `drop.levels` argument, and now preserves variable labels by default.
* Argument `def.value` in `get_label()` now also applies to data frame arguments.
* If factor levels are numeric and factor has value labels, these are used in `to_value()` by default.
* `to_factor()` no longer generates `NA` or `NaN`-levels when converting input into factors.

## Bug fixes
* `rec()` did not recode values, when these were the first element of a multi-line string of the `recodes` argument.
* `is_empty()` returned `NA` instead of `TRUE` for empty character vectors.
* Fixed bug with erroneous assignment of value labels to subset data when using `copy_labels()` ([#20](https://github.com/strengejacke/sjmisc/issues/20))
10 changes: 5 additions & 5 deletions R/frq.R
@@ -1,4 +1,4 @@
#' @title Frequencies of labelled variables
#' @title Frequency table of labelled variables
#' @name frq
#'
#' @description This function returns a frequency table of labelled vectors, as data frame.
@@ -7,9 +7,9 @@
#' according to their frequencies or not. Default is \code{"none"}, so
#' categories are not sorted by frequency. Use \code{"asc"} or
#' \code{"desc"} for sorting categories ascending or descending order.
#' @param weight.by Name of variable in \code{x} that indicated the vector of
#' weights that will be applied to weight all observations. Default is
#' \code{NULL}, so no weights are used.
#' @param weight.by Bare name, or name as string, of a variable in \code{x}
#' that indicates the vector of weights, which will be applied to weight all
#' observations. Default is \code{NULL}, so no weights are used.
#' @param auto.grp Numeric value, indicating the minimum amount of unique
#' values in a variable, at which automatic grouping into smaller units
#' is done (see \code{\link{group_var}}). Default value for \code{auto.group}
@@ -36,7 +36,7 @@
#' The \code{print()}-method adds a table header with information on the
#' variable label, variable type, total and valid N, and mean and
#' standard deviations. Mean and SD are \emph{always} printed, even for
#' categorical vriables (factors) or character vectors. In this case,
#' categorical variables (factors) or character vectors. In this case,
#' values are coerced into numeric vector to calculate the summary
#' statistics.
#'
28 changes: 13 additions & 15 deletions R/row_sums.R
@@ -1,12 +1,10 @@
#' @title Row sums and means for data frames
#' @name row_sums
#'
#' @description \code{row_sums()} simply wraps \code{\link{rowSums}}, while
#' \code{row_means()} simply wraps \code{\link[sjstats]{mean_n}},
#' however, the argument-structure of both functions is designed
#' to work nicely within a pipe-workflow and allows select-helpers
#' for selecting variables and the return value is always a tibble
#' (with one variable).
#' @description \code{row_sums()} and \code{row_means()} compute row sums or means
#' for at least \code{n} valid values per row. The functions are designed
#' to work nicely within a pipe-workflow and allow select-helpers
#' for selecting variables.
#'
#' @param n May either be
#' \itemize{
@@ -20,17 +18,17 @@
#' @inheritParams rec
#'
#' @return For \code{row_sums()}, a tibble with a new variable: the row sums from
#' \code{x}; for \code{row_means()}, a tibble with a new variable: the row
#' means from \code{x}. If \code{append = FALSE}, only the new variable
#' with row sums resp. row means is returned. \code{total_mean()} returns
#' the mean of all values from all specified columns in a data frame.
#' \code{x}; for \code{row_means()}, a tibble with a new variable: the row
#' means from \code{x}. If \code{append = FALSE}, only the new variable
#' with row sums resp. row means is returned. \code{total_mean()} returns
#' the mean of all values from all specified columns in a data frame.
#'
#' @details For \code{n}, must be a numeric value from \code{0} to \code{ncol(x)}. If
#' a \emph{row} in \code{x} has at least \code{n} non-missing values, the
#' row mean or sum is returned. If \code{n} is a non-integer value from 0 to 1,
#' \code{n} is considered to indicate the proportion of necessary non-missing
#' values per row. E.g., if \code{n = .75}, a row must have at least \code{ncol(x) * n}
#' non-missing values for the row mean or sum to be calculated. See 'Examples'.
#' a \emph{row} in \code{x} has at least \code{n} non-missing values, the
#' row mean or sum is returned. If \code{n} is a non-integer value from 0 to 1,
#' \code{n} is considered to indicate the proportion of necessary non-missing
#' values per row. E.g., if \code{n = .75}, a row must have at least \code{ncol(x) * n}
#' non-missing values for the row mean or sum to be calculated. See 'Examples'.
#'
#' @examples
#' data(efc)
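
Note on the `n` argument documented in `R/row_sums.R`: the rule in `@details` (return a row mean only if the row has at least `n` non-missing values, with a non-integer `n` between 0 and 1 read as a proportion of columns) can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper `row_means_n()` and the toy data frame `dat` are hypothetical, and the proportion case is compared directly against `ncol(x) * n` without any rounding, which may differ from what sjmisc does internally.

```r
# Hypothetical helper, NOT part of sjmisc: row means that require a
# minimum number of non-missing values per row.
row_means_n <- function(x, n) {
  x <- as.data.frame(x)
  # Assumption: a non-integer n between 0 and 1 is a proportion of columns.
  min_valid <- if (n > 0 && n < 1) ncol(x) * n else n
  valid <- rowSums(!is.na(x))          # count of non-missing values per row
  means <- rowMeans(x, na.rm = TRUE)   # mean over the non-missing values
  means[valid < min_valid] <- NA       # too few valid values: no mean
  means
}

# Toy data, invented for illustration.
dat <- data.frame(
  a = c(1, 2, NA, 4),
  b = c(NA, 2, NA, 5),
  c = c(NA, 4, NA, NA),
  d = c(2, 3, 7, 8)
)

row_means_n(dat, n = 3)    # rows 2 and 4 have at least 3 valid values
row_means_n(dat, n = 0.5)  # rows need at least 4 * 0.5 = 2 valid values
```

With `n = 3`, rows 1 and 3 yield `NA` because they contain only two and one valid values, respectively; with `n = 0.5`, only row 3 falls below the threshold.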
2 changes: 1 addition & 1 deletion docs/CODE_OF_CONDUCT.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/CONTRIBUTING.html

2 changes: 1 addition & 1 deletion docs/LICENSE-text.html

4 changes: 2 additions & 2 deletions docs/articles/design_philosophy.html

4 changes: 2 additions & 2 deletions docs/articles/exploringdatasets.html

2 changes: 1 addition & 1 deletion docs/articles/index.html

2 changes: 1 addition & 1 deletion docs/authors.html
