Commit 53fa751

Update regexp-substr.md (#1703)
1 parent 522a105 commit 53fa751

1 file changed: +28 -0 lines changed

docs/en/sql-reference/20-sql-functions/06-string-functions/regexp-substr.md

@@ -4,6 +4,34 @@ title: REGEXP_SUBSTR

Returns the substring of the string `expr` that matches the regular expression specified by the pattern `pat`, or NULL if there is no match. If `expr` or `pat` is NULL, the return value is NULL.

- REGEXP_SUBSTR does not support extracting capture groups (subpatterns defined by parentheses `()`). It returns the entire matched substring instead of specific captured groups.

```sql
SELECT REGEXP_SUBSTR('abc123', '([a-z]+)(\d+)');
-- Returns 'abc123' (the entire match), not 'abc' or '123'.

-- Alternative: use string functions such as SUBSTRING and REGEXP_INSTR to extract the desired portion of the string manually:
SELECT SUBSTRING('abc123', 1, REGEXP_INSTR('abc123', '\d+') - 1);
-- Returns 'abc' (extracts the part before the digits).
SELECT SUBSTRING('abc123', REGEXP_INSTR('abc123', '\d+'));
-- Returns '123' (extracts the digits).
```

- REGEXP_SUBSTR does not support the `e` parameter (used in Snowflake to extract capture groups) or the `group_num` parameter for specifying which capture group to return.

```sql
SELECT REGEXP_SUBSTR('abc123', '(\w+)(\d+)', 1, 1, 'e', 1);
-- Error: Databend does not support the 'e' parameter or capture group extraction.

-- Alternative: use string functions such as SUBSTRING and LOCATE to extract the desired substring manually, or preprocess the data with external tools (e.g., Python) to extract capture groups before querying.
SELECT SUBSTRING(
    REGEXP_SUBSTR('letters:abc,numbers:123', 'letters:[a-z]+,numbers:[0-9]+'),
    LOCATE('letters:', 'letters:abc,numbers:123') + 8,
    LOCATE(',', 'letters:abc,numbers:123') - (LOCATE('letters:', 'letters:abc,numbers:123') + 8)
);
-- Returns 'abc'
```
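
The preprocessing route mentioned above could look roughly like the following Python sketch, which uses the standard `re` module to pull out a capture group before the data is loaded or queried. The helper name `extract_group` and the sample pattern are illustrative assumptions and are not part of Databend.

```python
import re


def extract_group(text, pattern, group_num=1):
    """Return capture group `group_num` of the first match, or None.

    Hypothetical helper: mirrors the SQL behaviour of returning NULL
    when an input is NULL or when there is no match.
    """
    if text is None or pattern is None:
        return None
    match = re.search(pattern, text)
    return match.group(group_num) if match else None


# Same sample string as the SQL workaround above.
row = 'letters:abc,numbers:123'
pattern = r'letters:([a-z]+),numbers:([0-9]+)'

print(extract_group(row, pattern, 1))  # abc
print(extract_group(row, pattern, 2))  # 123
```

The extracted values can then be stored in their own columns, so queries do not need capture-group support at all.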

## Syntax

```sql
