-
Notifications
You must be signed in to change notification settings - Fork 7
/
README.Rmd
203 lines (147 loc) · 6.37 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
devtools::load_all()
```
<!-- badges: start -->
[![CRAN Status Badge](https://www.r-pkg.org/badges/version/furniture)](https://cran.r-project.org/package=furniture)
![Downloads](https://cranlogs.r-pkg.org/badges/grand-total/furniture)
[![R-CMD-check](https://github.com/TysonStanley/furniture/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/TysonStanley/furniture/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->
# furniture: `v1.10.0` <img src="man/figures/furniture_hex_v2_full.png" align="right" width="40%" height="40%" />
The furniture R package contains functions to help with data cleaning/tidying (e.g., `washer()`, `rowmeans()`, `rowsums()`), exploratory data analysis and reporting (e.g., `table1()`, `tableC()`, `tableF()`). It currently contains eight main functions:
1. `table1()` : gives a well-formatted table for academic publication of descriptive statistics. Very useful for quick analyses as well. Notably, `table1()` now works with `dplyr::group_by()`.
2. `tableC()` : gives a well-formatted table of correlations.
3. `tableF()` : provides a thorough frequency table for quick checks of the levels of a variable.
4. `washer()` : changes several values in a variable (very useful for changing place holder values to missing).
5. `long()` : is a wrapper of `stats::reshape()`, takes the data from wide to long format (long is often the tidy version of the data), works well with the tidyverse, and can handle unbalanced multilevel data.
6. `wide()` : also a wrapper of `stats::reshape()`, takes the data from long to wide, and like `long()`, works well with the tidyverse and can handle unbalanced multilevel data.
7. `rowmeans()` and `rowmeans.n()` : tidyverse friendly versions of `rowMeans()`, where the `rowmeans.n()` function allows `n` number of missing
8. `rowsums()` and `rowsums.n()` : tidyverse friendly versions of `rowSums()`, where the `rowsums.n()` function allows `n` number of missing
In conjunction with many other tidy tools, the package should be useful for health, behavioral, and social scientists working on quantitative research.
# Installation
The latest stable build of the package can be downloaded from CRAN via:
```{r, eval = FALSE}
install.packages("furniture")
```
You can download the developmental version via:
```{r, eval = FALSE}
remotes::install_github("tysonstanley/furniture")
```
# Using furniture
The main functions are the `table*()` functions (e.g., `table1()`, `tableC()`, `tableF()`).
```{r}
library(furniture)
```
```{r}
data("nhanes_2010")
table1(nhanes_2010,
age, marijuana, illicit, rehab,
splitby=~asthma,
na.rm = FALSE)
```
```{r}
table1(nhanes_2010,
age, marijuana, illicit, rehab,
splitby=~asthma,
output = "text2",
na.rm = FALSE)
```
```{r, warning=FALSE, message=FALSE}
library(tidyverse)
nhanes_2010 %>%
group_by(asthma) %>%
table1(age, marijuana, illicit, rehab,
output = "text2",
na.rm = FALSE)
```
`table1()` can do bivariate significance tests as well.
```{r, warning=FALSE, message=FALSE}
library(tidyverse)
nhanes_2010 %>%
group_by(asthma) %>%
table1(age, marijuana, illicit, rehab,
output = "text2",
na.rm = FALSE,
test = TRUE)
```
By default it does the appropriate parametric tests. However, you can change that by setting `param = FALSE` (new with `v 1.9.1`).
```{r, warning=FALSE, message=FALSE}
library(tidyverse)
nhanes_2010 %>%
group_by(asthma) %>%
table1(age, marijuana, illicit, rehab,
output = "text2",
na.rm = FALSE,
test = TRUE,
param = FALSE,
type = "condense")
```
It can also do a total column with the stratified columns (new with `v 1.9.0`) with the `total = TRUE` argument.
```{r, warning=FALSE, message=FALSE}
nhanes_2010 %>%
group_by(asthma) %>%
table1(age, marijuana, illicit, rehab,
output = "text2",
na.rm = FALSE,
test = TRUE,
type = "condense",
total = TRUE)
```
It can also report the statistics in addition to the p-values.
```{r, warning=FALSE, message=FALSE}
nhanes_2010 %>%
group_by(asthma) %>%
table1(age, marijuana, illicit, rehab,
output = "text2",
na.rm = FALSE,
test = TRUE,
total = TRUE,
type = "full")
```
`table1()` can be outputted directly to other formats. All `knitr::kable()` options are available for this and there is an extra option `"latex2"` which provides a publication ready table in Latex documents.
A new function `table1_gt()` integrates the fantastic `gt` package to make beautiful HTML tables for table1.
```{r, results='asis'}
nhanes_2010 %>%
group_by(asthma) %>%
table1(age, marijuana, illicit, rehab, na.rm = FALSE) %>%
table1_gt() %>%
print()
```
The `tableC()` function gives a well-formatted correlation table.
```{r}
tableC(nhanes_2010,
age, active, vig_active,
na.rm=TRUE)
```
The `tableF()` function gives a table of frequencies.
```{r}
tableF(nhanes_2010, age)
```
In addition, the `rowmeans()` and `rowsums()` functions offer a simplified use of `rowMeans()` and `rowSums()`, particularly when using the tidyverse's `mutate()`.
```{r}
nhanes_2010 %>%
select(vig_active, mod_active) %>%
mutate(avg_active = rowmeans(vig_active, mod_active, na.rm=TRUE)) %>%
mutate(sum_active = rowsums(vig_active, mod_active, na.rm=TRUE)) %>%
head()
```
The `rowmeans.n()` and `rowsums.n()` allow `n` missing values while still calculating the mean or sum.
```{r}
df <- data.frame(
x = c(NA, 1:5),
y = c(1:5, NA)
)
df %>%
mutate(avg = rowmeans.n(x, y, n = 1))
```
## Notes
The package is most useful in conjunction with other tidy tools to get data cleaned/tidied and start exploratory data analysis. I recommend using packages such as `library(dplyr)`, `library(tidyr)`, and `library(ggplot2)` with `library(furniture)` to accomplish this.
The original function--`table1()`--is simply built for both exploratory descriptive analysis and communication of findings. See vignettes or [tysonbarrett.com](https://tysonstanley.github.io/) for several examples of its use. Also see our paper in the [R Journal](https://journal.r-project.org/archive/2017/RJ-2017-037/RJ-2017-037.pdf).