Add support for patterns(cols=user_provided) (#6510)
* grep(value=TRUE) in patterns()

* test user-defined cols subset of names(DT)

* vector(s)

* comment ref issue

* adjust NEWS

---------

Co-authored-by: Toby Dylan Hocking
Co-authored-by: Michael Chirico
3 people authored Sep 25, 2024
1 parent 2a70e2f commit 92a29f8
Showing 4 changed files with 8 additions and 3 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -63,6 +63,8 @@ rowwiseDT(
 # 15: All values Total Total 999 999 NaN 10
 ```
 
+4. `patterns()` in `melt()` combines correctly with user-defined `cols=`, which can be useful to specify a subset of columns to reshape without having to use a regex, for example `patterns("2", cols=c("y1", "y2"))` will only give `y2` even if there are other columns in the input matching `2`, [#6498](https://github.com/Rdatatable/data.table/issues/6498). Thanks to @hongyuanjia for the report, and to @tdhock for the PR.
+
 ## BUG FIXES
 
 1. Using `print.data.table()` with character truncation using `datatable.prettyprint.char` no longer errors with `NA` entries, [#6441](https://github.com/Rdatatable/data.table/issues/6441). Thanks to @r2evans for the bug report, and @joshhwuu for the fix.
2 changes: 1 addition & 1 deletion R/fmelt.R
@@ -29,7 +29,7 @@ patterns = function(..., cols=character(0L), ignore.case=FALSE, perl=FALSE, fixe
   p = unlist(L, use.names = any(nzchar(names(L))))
   if (!is.character(p))
     stopf("Input patterns must be of type character.")
-  matched = lapply(p, grep, cols, ignore.case=ignore.case, perl=perl, fixed=fixed, useBytes=useBytes)
+  matched = lapply(p, grep, cols, ignore.case=ignore.case, perl=perl, fixed=fixed, useBytes=useBytes, value=TRUE)
   if (length(idx <- which(lengths(matched) == 0L)))
     stopf(ngettext(length(idx), 'Pattern not found: [%s]', 'Patterns not found: [%s]'), brackify(p[idx]), domain=NA)
   if (length(matched) == 1L) return(matched[[1L]])
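The effect of adding `value=TRUE` can be illustrated with base R alone. The sketch below is not taken from the commit: `all_cols` is a hypothetical stand-in for `names(DT)`, and it assumes that the caller resolves integer positions against the full table's column names, which this diff does not show.

```r
# Hypothetical column names, mirroring the small table used in the new tests.
all_cols = c("x1", "x2", "y1", "y2")
# User-provided subset, as it would be passed via patterns(..., cols=).
cols = c("y1", "y2")

# Old call shape: grep() returns positions within `cols`.
idx = grep("2", cols)
# New call shape: value=TRUE returns the matching names themselves.
nm = grep("2", cols, value=TRUE)

print(idx)            # 2
print(all_cols[idx])  # "x2": position 2 in the full table is a different column
print(nm)             # "y2"
```

Returning names rather than positions means the result no longer depends on where the chosen columns sit in the full table, which is consistent with the expected output of test 1866.7.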
3 changes: 3 additions & 0 deletions inst/tests/tests.Rraw
@@ -12427,6 +12427,9 @@ DTout = data.table(
   value2 = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")
 )
 test(1866.6, melt(DT, measure.vars = patterns("^x", "^y", cols=names(DT))), DTout)
+# patterns supports cols arg, #6498
+test(1866.7, melt(data.table(x1=1,x2=2,y1=3,y2=4),measure.vars=patterns("2",cols=c("y1","y2"))), data.table(x1=1,x2=2,y1=3,variable=factor("y2"),value=4))
+test(1866.8, DT[, lapply(.SD, sum), .SDcols=patterns("2",cols=c("x1","x2"))], data.table(x2=40L))
 
 # informative errors for bad user-provided cols arg to patterns
 DT = data.table(x1=1,x2=2,y1=3,y2=4)
4 changes: 2 additions & 2 deletions man/patterns.Rd
@@ -2,8 +2,8 @@
 \alias{patterns}
 \title{Obtain matching indices corresponding to patterns}
 \description{
-\code{patterns} returns the matching indices in the argument \code{cols}
-corresponding to the regular expression patterns provided. The patterns must be
+\code{patterns} returns the elements of \code{cols}
+that match the regular expression patterns, which must be
 supported by \code{\link[base]{grep}}.
 
 From \code{v1.9.6}, \code{\link{melt.data.table}} has an enhanced functionality
