Skip to content

Commit

Permalink
Update README examples
Browse files Browse the repository at this point in the history
  • Loading branch information
lfittl committed Mar 18, 2021
1 parent 25c3202 commit 8357aac
Showing 1 changed file with 88 additions and 34 deletions.
122 changes: 88 additions & 34 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# libpg_query [![Build Status](https://travis-ci.org/lfittl/libpg_query.svg?branch=master)](https://travis-ci.org/lfittl/libpg_query)
# libpg_query

C library for accessing the PostgreSQL parser outside of the server.

Expand All @@ -12,7 +12,7 @@ You can find further background to why a query's parse tree is useful here: http
## Installation

```sh
git clone -b 10-latest git://github.com/lfittl/libpg_query
git clone -b 13-latest git://github.com/lfittl/libpg_query
cd libpg_query
make
```
Expand Down Expand Up @@ -47,10 +47,39 @@ Compile it like this:
cc -Ilibpg_query -Llibpg_query example.c -lpg_query
```

This will output:
This will output the parse tree (whitespace adjusted here for better readability):

```json
[{"SelectStmt": {"targetList": [{"ResTarget": {"val": {"A_Const": {"val": {"Integer": {"ival": 1}}, "location": 7}}, "location": 7}}], "op": 0}}]
{
"version": 130002,
"stmts": [
{
"stmt": {
"SelectStmt": {
"targetList": [
{
"ResTarget": {
"val": {
"A_Const": {
"val": {
"Integer": {
"ival": 1
}
},
"location": 7
}
},
"location": 7
}
}
],
"limitOption": "LIMIT_OPTION_DEFAULT",
"op": "SETOP_NONE"
}
}
}
]
}
```

## Usage: Scanning a query into its tokens using the PostgreSQL scanner/lexer
Expand All @@ -61,45 +90,61 @@ syntax highlighting, where one is mostly concerned with differentiating keywords
from identifiers and other parts of the query:

```c
#include <pg_query.h>
#include <stdio.h>

#include <pg_query.h>
#include "protobuf/pg_query.pb-c.h"

int main() {
PgQueryScanResult result;
PgQuery__ScanResult *scan_result;
PgQuery__ScanToken *scan_token;
const ProtobufCEnumValue *token_kind;
const ProtobufCEnumValue *keyword_kind;
const char *input = "SELECT update AS left /* comment */ FROM between";

result = pg_query_scan(input);
scan_result = pg_query__scan_result__unpack(NULL, result.pbuf.len, (void *) result.pbuf.data);

printf(" version: %d, tokens: %ld, size: %d\n", scan_result->version, scan_result->n_tokens, result.pbuf.len);
for (size_t j = 0; j < scan_result->n_tokens; j++) {
scan_token = scan_result->tokens[j];
token_kind = protobuf_c_enum_descriptor_get_value(&pg_query__token__descriptor, scan_token->token);
keyword_kind = protobuf_c_enum_descriptor_get_value(&pg_query__keyword_kind__descriptor, scan_token->keyword_kind);
printf(" \"%.*s\" = [ %d, %d, %s, %s ]\n", scan_token->end - scan_token->start, &(input[scan_token->start]), scan_token->start, scan_token->end, token_kind->name, keyword_kind->name);
}

result = pg_query_scan("SELECT update AS left /* comment */ FROM between");

printf("%s\n", result.scan_output);

pg_query__scan_result__free_unpacked(scan_result, NULL);
pg_query_free_scan_result(result);

return 0;
}
```

This will output the following:

```json
[
[ 0, 6, 597, 3 ],
[ 7, 13, 597, 3 ],
[ 14, 20, 663, 0 ],
[ 21, 23, 290, 3 ],
[ 24, 28, 474, 2 ]
[ 43, 47, 417, 3 ],
[ 48, 55, 302, 1 ]
]
```
version: 130002, tokens: 7, size: 77
"SELECT" = [ 0, 6, SELECT, RESERVED_KEYWORD ]
"update" = [ 7, 13, UPDATE, UNRESERVED_KEYWORD ]
"AS" = [ 14, 16, AS, RESERVED_KEYWORD ]
"left" = [ 17, 21, LEFT, TYPE_FUNC_NAME_KEYWORD ]
"/* comment */" = [ 22, 35, C_COMMENT, NO_KEYWORD ]
"FROM" = [ 36, 40, FROM, RESERVED_KEYWORD ]
"between" = [ 41, 48, BETWEEN, COL_NAME_KEYWORD ]
```

Where the each element in the array represents a token and has the following fields:
Where the each element in the token list has the following fields:

1. Start location in the source string
2. End location in the source string
3. Token value - see `protobuf/scan_output.proto`
4. Keyword type:
`-1`: Not a keyword
`0`: Unreserved keyword (available for use as any kind of unescaped name)
`1`: Unreserved keyword (can be unescaped column/table/etc names, cannot be unescaped function or type name)
`2`: Reserved keyword (can be unescaped function or type name, cannot be unescaped column/table/etc names)
`3`: Reserved keyword (cannot be unescaped column/table/variable/type/function names)
3. Token value - see Token type in `protobuf/pg_query.proto`
4. Keyword type - see KeywordKind type in `protobuf/pg_query.proto`, possible values:
`NO_KEYWORD`: Not a keyword
`UNRESERVED_KEYWORD`: Unreserved keyword (available for use as any kind of unescaped name)
`COL_NAME_KEYWORD`: Unreserved keyword (can be unescaped column/table/etc names, cannot be unescaped function or type name)
`TYPE_FUNC_NAME_KEYWORD`: Reserved keyword (can be unescaped function or type name, cannot be unescaped column/table/etc names)
`RESERVED_KEYWORD`: Reserved keyword (cannot be unescaped column/table/variable/type/function names)

Note that whitespace does not show as tokens.

Expand All @@ -120,7 +165,7 @@ int main() {

result = pg_query_fingerprint("SELECT 1");

printf("%s\n", result.hexdigest);
printf("%s\n", result.fingerprint_str);

pg_query_free_fingerprint_result(result);
}
Expand All @@ -129,7 +174,7 @@ int main() {
This will output:

```
8e1acac181c6d28f4a923392cf1c4eda49ee4cd2
50fde20626009aba
```

See https://github.com/lfittl/libpg_query/wiki/Fingerprinting for the full fingerprinting rules.
Expand Down Expand Up @@ -174,18 +219,27 @@ This will output:
```json
[
{"PLpgSQL_function": {"datums": [{"PLpgSQL_var": {"refname": "found", "datatype": {"PLpgSQL_type": {"typname": "UNKNOWN"}}}}], "action": {"PLpgSQL_stmt_block": {"lineno": 1, "body": [{"PLpgSQL_stmt_if": {"lineno": 1, "cond": {"PLpgSQL_expr": {"query": "SELECT v_version IS NULL"}}, "then_body": [{"PLpgSQL_stmt_return": {"lineno": 1, "expr": {"PLpgSQL_expr": {"query": "SELECT v_name"}}}}]}}, {"PLpgSQL_stmt_return": {"lineno": 1, "expr": {"PLpgSQL_expr": {"query": "SELECT v_name || '/' || v_version"}}}}]}}}}
{"PLpgSQL_function":{"datums":[{"PLpgSQL_var":{"refname":"found","datatype":{"PLpgSQL_type":{"typname":"UNKNOWN"}}}}],"action":{"PLpgSQL_stmt_block":{"lineno":1,"body":[{"PLpgSQL_stmt_if":{"lineno":1,"cond":{"PLpgSQL_expr":{"query":"SELECT v_version IS NULL"}},"then_body":[{"PLpgSQL_stmt_return":{"lineno":1,"expr":{"PLpgSQL_expr":{"query":"SELECT v_name"}}}}]}},{"PLpgSQL_stmt_return":{"lineno":1,"expr":{"PLpgSQL_expr":{"query":"SELECT v_name || '/' || v_version"}}}}]}}}}
]
```
## Versions
For stability, it is recommended you use individual tagged git versions, see CHANGELOG.
`master` reflects a PostgreSQL base version of 9.4, with a legacy output format.
New development is happening on `10-latest`, which reflects a base version of Postgres 10.
Each major version is maintained in a dedicated git branch. Only the latest Postgres stable release receives active updates.
```
| PostgreSQL Major Version | Branch | Status |
|--------------------------|-----------|---------------------|
| 13 | 13-latest | Active development |
| 12 | (n/a) | Not supported |
| 11 | (n/a) | Not supported |
| 10 | 10-latest | Critical fixes only |
| 9.6 | (n/a) | Not supported |
| 9.5 | (n/a) | Not supported |
| 9.4 | master | End Of Life |
```
## Resources
Expand All @@ -208,7 +262,7 @@ Products, tools and libraries built on pg_query:
* [pgscope](https://github.com/gjalves/pgscope)
* [pg_materialize](https://github.com/aanari/pg-materialize)
* [DuckDB](https://github.com/cwida/duckdb) ([details](https://github.com/cwida/duckdb/tree/master/third_party/libpg_query))
* and more
Please feel free to [open a PR](https://github.com/lfittl/libpg_query/pull/new/master) to add yours! :)
Expand Down

0 comments on commit 8357aac

Please sign in to comment.