How to state na.rm=T for functions? #6

Open
xiekunwhy opened this issue Sep 12, 2018 · 1 comment

Comments

@xiekunwhy

Hi,
I think NA values in data are the most common case in practice, and I don't know how to specify na.rm=T or na.rm=F for NA-sensitive functions like mean and sum.

Xie Kun.

@tavareshugo
Owner

Hi @xiekunwhy

Thanks for your interest in the package, and you're absolutely right, NA is ubiquitous in data!

With the way winScan() works at the moment, you have two options: 1) remove the missing values before doing the window summaries or 2) create a custom function to pass to winScan().

I will illustrate both with some simple example data:

library(WindowScanR) # load the library

# Create some example data with missing values
df <- data.frame(x = 1:10, y = c(1, 1, 2, NA, 3, 3, NA, 3, 2, 2))

This is what it looks like:

   x  y
   1  1
   2  1
   3  2
   4 NA
   5  3
   6  3
   7 NA
   8  3
   9  2
  10  2

Option 1
Remove missing values before doing the window summaries. For example:

df_filtered <- subset(df, !is.na(y))

Which now looks like:

   x y
   1 1
   2 1
   3 2
   5 3
   6 3
   8 3
   9 2
  10 2

So you can now do:

winScan(df_filtered, position = "x", values = "y", win_size = 2, funs = c("mean", "sum"))
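If you go this route, it may be worth checking how many rows were dropped and at which positions, since any window covering those positions is presumably summarised from fewer values. A minimal base R sketch (the names n_dropped and dropped_positions are only illustrative, not part of the package):

```r
# Same example data as above
df <- data.frame(x = 1:10, y = c(1, 1, 2, NA, 3, 3, NA, 3, 2, 2))

# Keep only the rows where y is not missing
df_filtered <- subset(df, !is.na(y))

# How many rows were removed, and at which positions?
n_dropped <- nrow(df) - nrow(df_filtered)
dropped_positions <- df$x[is.na(df$y)]

n_dropped          # 2
dropped_positions  # 4 7

# Confirm that no missing values remain in the filtered data
anyNA(df_filtered$y)  # FALSE
```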

Option 2
Create a custom function

Instead, we can write wrapper functions around mean() and sum() that always set na.rm = TRUE, for example:

mean_na_rm <- function(x, ...){
  mean(x, na.rm = TRUE, ...)
}

sum_na_rm <- function(x, ...){
  sum(x, na.rm = TRUE, ...)
}

And now we use these functions in our winScan() call with the original data:

winScan(df, position = "x", values = "y", win_size = 2, funs = c("mean_na_rm", "sum_na_rm"))
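To see what these wrappers do on their own, outside of winScan(), here is a small base R check. It also shows one thing to be aware of: when every value is missing (as could happen in a window that only contains NA), mean() with na.rm = TRUE returns NaN while sum() returns 0. The vectors y and all_missing are made up for illustration.

```r
# Wrappers as defined above
mean_na_rm <- function(x, ...) {
  mean(x, na.rm = TRUE, ...)
}

sum_na_rm <- function(x, ...) {
  sum(x, na.rm = TRUE, ...)
}

# A vector with one missing value
y <- c(2, NA, 3)

mean(y)        # NA, because the default is na.rm = FALSE
mean_na_rm(y)  # 2.5
sum_na_rm(y)   # 5

# A vector where every value is missing
all_missing <- c(NA_real_, NA_real_)

mean_na_rm(all_missing)  # NaN (mean of zero values)
sum_na_rm(all_missing)   # 0 (sum of zero values)
```

So a sum of 0 from sum_na_rm() can mean either "the values add up to zero" or "there were no non-missing values", which is worth keeping in mind when interpreting the output.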

Hope this helps!


PS: for future reference, it helps if you provide a reproducible example. Thanks!
