feat: postgres_fdw demo #66

Merged
merged 2 commits into from
Dec 13, 2024
4 changes: 4 additions & 0 deletions postgres-fdw/.gitignore
@@ -0,0 +1,4 @@
/prometheus-data
/prometheus.yml
greptime.env
greptime.env.bak
119 changes: 119 additions & 0 deletions postgres-fdw/README.md
@@ -0,0 +1,119 @@
# GreptimeDB for Postgres_fdw

This Docker Compose demo shows how to configure GreptimeDB with the Postgres
foreign data wrapper (`postgres_fdw`) so that you can query GreptimeDB from
vanilla Postgres.

This demo uses [Vector](https://vector.dev) as a data source: it generates demo
logs and ingests them into GreptimeDB through Vector's built-in GreptimeDB sink.

## How to run this demo

Ensure that `git`, `docker`, `docker compose` and the `psql` client are
installed. Docker Compose version 2.24 or higher is required. To run this
demo:

```shell
git clone https://github.com/GreptimeTeam/demo-scene.git
cd demo-scene/postgres-fdw
docker compose up
```
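
Before connecting, it can help to confirm the Compose version and wait until
GreptimeDB reports healthy. The sketch below is only an illustration: the
helper names `version_at_least` and `retry` are hypothetical, the health URL is
the one used by the healthcheck in `docker-compose.yml`, and the retry values
mirror that healthcheck (5 attempts, 3 seconds apart).

```shell
# Succeeds when version $1 is at least $2, comparing major.minor only.
version_at_least() {
  have=${1#v}; want=${2#v}
  have_major=${have%%.*}; have_rest=${have#*.}; have_minor=${have_rest%%.*}
  want_major=${want%%.*}; want_rest=${want#*.}; want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Retries a command: $1 attempts, $2 seconds between attempts, the rest is the command.
retry() {
  attempts=$1; delay=$2; shift 2
  while [ "$attempts" -gt 0 ]; do
    "$@" && return 0
    attempts=$((attempts - 1))
    [ "$attempts" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# Usage against the running demo (left as comments so the helpers can be sourced safely):
#   version_at_least "$(docker compose version --short)" 2.24 || echo "Docker Compose 2.24+ is required"
#   retry 5 3 curl -fsS http://127.0.0.1:4000/health
```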

You can access GreptimeDB with the `psql` client. Run
`psql -h 127.0.0.1 -p 4003 -d public` to connect to the database, then start
with a query such as `SHOW TABLES`.

```
psql -h 127.0.0.1 -p 4003 -d public
psql (16.5, server 16.3-greptimedb-0.11.0)
Type "help" for help.

public=> show tables;
     Tables
----------------
 demo_logs_json
 numbers
(2 rows)
```

Next, connect to vanilla Postgres with
`psql -h 127.0.0.1 -p 5432 -U postgres`. From there you can list the foreign
servers and tables and query data from the remote GreptimeDB.
Comment on lines +38 to +40

Member: Do we need to briefly explain how PostgreSQL recognizes GreptimeDB? Maybe add a link to init.sh.

Member Author: It's explained in the "How it works" section below.

```
psql -h 127.0.0.1 -p 5432 -U postgres
psql (16.5, server 17.2 (Debian 17.2-1.pgdg120+1))
WARNING: psql major version 16, server major version 17.
Some psql features might not work.
Type "help" for help.

postgres=# \d
                   List of relations
 Schema |       Name        |     Type      |  Owner
--------+-------------------+---------------+----------
 public | ft_demo_logs_json | foreign table | postgres
(1 row)

postgres=# \des
           List of foreign servers
    Name    |  Owner   | Foreign-data wrapper
------------+----------+----------------------
 greptimedb | postgres | postgres_fdw
(1 row)

postgres=# SELECT count(*) FROM ft_demo_logs_json;
 count
-------
   754
(1 row)

postgres=# SELECT count(*) FROM ft_demo_logs_json WHERE method = 'GET';
 count
-------
   109
(1 row)

postgres=# SELECT * FROM ft_demo_logs_json WHERE method = 'GET' ORDER BY greptime_timestamp DESC LIMIT 10;
bytes | datetime | host | method | protocol | referer | request | status | user-identifier | greptime_timestamp
-------+----------------------+----------------+--------+----------+--------------------------------------------------------+------------------------------+--------+-----------------+----------------------------
17367 | 13/Dec/2024:07:33:54 | 241.75.77.209 | GET | HTTP/1.0 | https://up.nu/booper/bopper/mooper/mopper | /wp-admin | 301 | BronzeGamer | 2024-12-13 07:33:54.809142
1214 | 13/Dec/2024:07:33:52 | 95.95.146.122 | GET | HTTP/1.1 | https://for.florist/observability/metrics/production | /booper/bopper/mooper/mopper | 301 | devankoshal | 2024-12-13 07:33:52.809289
16179 | 13/Dec/2024:07:33:50 | 218.254.51.147 | GET | HTTP/1.0 | https://up.bradesco/wp-admin | /user/booperbot124 | 307 | ahmadajmi | 2024-12-13 07:33:50.809062
31284 | 13/Dec/2024:07:33:38 | 61.123.107.141 | GET | HTTP/1.0 | https://random.author/observability/metrics/production | /secret-info/open-sesame | 304 | jesseddy | 2024-12-13 07:33:38.808307
2048 | 13/Dec/2024:07:33:32 | 76.30.170.167 | GET | HTTP/2.0 | https://for.toyota/observability/metrics/production | /booper/bopper/mooper/mopper | 304 | BryanHorsey | 2024-12-13 07:33:32.808714
18429 | 13/Dec/2024:07:33:16 | 13.177.187.172 | GET | HTTP/2.0 | https://we.ikano/apps/deploy | /booper/bopper/mooper/mopper | 501 | jesseddy | 2024-12-13 07:33:16.808657
43773 | 13/Dec/2024:07:33:07 | 39.238.42.248 | GET | HTTP/1.0 | https://up.caravan/user/booperbot124 | /do-not-access/needs-work | 550 | benefritz | 2024-12-13 07:33:08.809523
26599 | 13/Dec/2024:07:33:04 | 68.55.2.213 | GET | HTTP/2.0 | https://names.fyi/this/endpoint/prints/money | /booper/bopper/mooper/mopper | 404 | shaneIxD | 2024-12-13 07:33:04.808889
38680 | 13/Dec/2024:07:32:57 | 101.219.74.21 | GET | HTTP/1.0 | https://for.eu/do-not-access/needs-work | /secret-info/open-sesame | 400 | b0rnc0nfused | 2024-12-13 07:32:58.809528
3421 | 13/Dec/2024:07:32:56 | 68.52.17.154 | GET | HTTP/2.0 | https://some.productions/this/endpoint/prints/money | /apps/deploy | 400 | shaneIxD | 2024-12-13 07:32:56.80887
(10 rows)
```
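
The same queries can also be run non-interactively with `psql -c`. The sketch
below reuses the connection settings from the session above. The helper
`run_query` and the `DRY_RUN` switch are hypothetical names added for
illustration: by default the helper only prints the command, and setting
`DRY_RUN=0` executes it against the running demo. The `EXPLAIN VERBOSE` query
is an optional extra that should list the remote SQL `postgres_fdw` plans to
send.

```shell
# Connection settings taken from the psql session above.
DEMO_PG_HOST=127.0.0.1
DEMO_PG_PORT=5432
DEMO_PG_USER=postgres

# Hypothetical helper: prints the psql command unless DRY_RUN=0, in which case it runs it.
run_query() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'psql -h %s -p %s -U %s -c "%s"\n' "$DEMO_PG_HOST" "$DEMO_PG_PORT" "$DEMO_PG_USER" "$1"
  else
    psql -h "$DEMO_PG_HOST" -p "$DEMO_PG_PORT" -U "$DEMO_PG_USER" -c "$1"
  fi
}

# Count GET requests through the foreign table.
run_query "SELECT count(*) FROM ft_demo_logs_json WHERE method = 'GET';"

# Ask Postgres for the plan of the same query, including the remote SQL.
run_query "EXPLAIN VERBOSE SELECT count(*) FROM ft_demo_logs_json WHERE method = 'GET';"
```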

## How it works

The topology is illustrated in this diagram.

```mermaid
flowchart LR
vector[Vector]
greptimedb[(GreptimeDB)]
postgresql[(PostgreSQL)]
psql[psql]

psql --> |Postgres Wire Protocol| postgresql
postgresql --> |Postgres Wire Protocol| greptimedb
vector --> |HTTP| greptimedb

psql --> |Postgres Wire Protocol| greptimedb
```

To see how PostgreSQL is configured to reach GreptimeDB through the
`postgres_fdw` extension, see
[init.sh](./docker-entrypoint-initdb.d/init.sh).
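
`init.sh` fills in the connection options with the shell's `${VAR:=default}`
expansion, so the same script works with or without a `greptime.env` file. The
sketch below shows only that expansion in isolation. The variables
`local_options` and `cloud_options` and the host and database values are
placeholders for illustration, not part of the demo.

```shell
# Simulate running without greptime.env: the variables are unset.
unset GREPTIME_HOST GREPTIME_DB

# Same expansion style as init.sh: use the value if set, otherwise assign the default.
local_options="host '${GREPTIME_HOST:=greptimedb}', dbname '${GREPTIME_DB:=public}', port '4003'"
echo "$local_options"
# prints: host 'greptimedb', dbname 'public', port '4003'

# Simulate values loaded from greptime.env (placeholder values, not real ones).
GREPTIME_HOST=my-instance.example.com
GREPTIME_DB=my_database
cloud_options="host '${GREPTIME_HOST:=greptimedb}', dbname '${GREPTIME_DB:=public}', port '4003'"
echo "$cloud_options"
# prints: host 'my-instance.example.com', dbname 'my_database', port '4003'
```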

## Run in GreptimeCloud

By default, this example writes data into a GreptimeDB instance running inside
the Docker Compose cluster. You can also write to your own GreptimeCloud
instance: create a `greptime.env` file from the sample `greptime.env.sample`
and provide your host, dbname and authentication information. Then run
`docker compose down` and `docker compose up` to recreate the cluster and apply
the new settings.
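
As a minimal sketch of that step, the snippet below writes a `greptime.env`
with the same keys as `greptime.env.sample`. The helper name
`write_greptime_env` is hypothetical, and every value passed to it is a
placeholder to replace with the details of your own instance.

```shell
# Hypothetical helper: writes an env file with the keys from greptime.env.sample.
# Arguments: host, dbname, username, password, output path.
write_greptime_env() {
  cat > "$5" <<EOF
GREPTIME_SCHEME=https
GREPTIME_PORT=443

GREPTIME_HOST=$1
GREPTIME_DB=$2
GREPTIME_USERNAME=$3
GREPTIME_PASSWORD=$4
EOF
}

# Placeholder values; replace them with the details of your own instance.
write_greptime_env "my-instance.example.com" "my_database" "my_user" "my_password" "greptime.env"

# Then recreate the cluster so the containers pick up the new settings:
#   docker compose down
#   docker compose up
```

Note that this overwrites any existing `greptime.env` in the current directory.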
59 changes: 59 additions & 0 deletions postgres-fdw/docker-compose.yml
@@ -0,0 +1,59 @@
services:
  envsubst:
    image: docker.io/widerplan/envsubst
    command: "-i /config_input/vector.toml -o /config_output/vector.toml"
    volumes:
      - ./vector.toml:/config_input/vector.toml
      - config:/config_output
    env_file:
      - path: "greptime.env"
        required: false
    init: true

  greptimedb:
    image: docker.io/greptime/greptimedb:v0.11.0
    command: standalone start --http-addr=0.0.0.0:4000 --rpc-addr=0.0.0.0:4001 --mysql-addr=0.0.0.0:4002 --postgres-addr 0.0.0.0:4003
    ports:
      - 4000:4000
      - 4001:4001
      - 4002:4002
      - 4003:4003
    networks:
      - demo-network
    healthcheck:
      test: [ "CMD", "curl", "-f", "http://127.0.0.1:4000/health" ]
      interval: 3s
      timeout: 3s
      retries: 5

  postgresql:
    image: docker.io/postgres:17
    ports:
      - 5432:5432
    networks:
      - demo-network
    volumes:
      - ./docker-entrypoint-initdb.d:/docker-entrypoint-initdb.d
    environment:
      - POSTGRES_HOST_AUTH_METHOD=trust
    env_file:
      - path: "greptime.env"
        required: false

  vector:
    image: docker.io/timberio/vector:0.43.X-alpine
    networks:
      - demo-network
    volumes:
      - config:/config_data
    command: "-c /config_data/vector.toml"
    depends_on:
      envsubst:
        condition: service_completed_successfully


networks:
  demo-network:

volumes:
  config:
31 changes: 31 additions & 0 deletions postgres-fdw/docker-entrypoint-initdb.d/init.sh
@@ -0,0 +1,31 @@
#!/usr/bin/env bash

set -e

psql -v ON_ERROR_STOP=1 --username "${POSTGRES_USER}" <<-EOSQL
CREATE EXTENSION postgres_fdw;

CREATE SERVER greptimedb
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host '${GREPTIME_HOST:=greptimedb}', dbname '${GREPTIME_DB:=public}', port '4003');

CREATE USER MAPPING FOR postgres
    SERVER greptimedb
    OPTIONS (user '${GREPTIME_USERNAME:=greptime}', password '${GREPTIME_PASSWORD:=greptime}');

CREATE FOREIGN TABLE ft_demo_logs_json (
    "bytes" INT8,
    "datetime" VARCHAR,
    "host" VARCHAR,
    "method" VARCHAR,
    "protocol" VARCHAR,
    "referer" VARCHAR,
    "request" VARCHAR,
    "status" VARCHAR,
    "user-identifier" VARCHAR,
    "greptime_timestamp" TIMESTAMP
)
    SERVER greptimedb
    OPTIONS (table_name 'demo_logs_json');

EOSQL
7 changes: 7 additions & 0 deletions postgres-fdw/greptime.env.sample
@@ -0,0 +1,7 @@
GREPTIME_SCHEME=https
GREPTIME_PORT=443

GREPTIME_HOST=
GREPTIME_DB=
GREPTIME_USERNAME=
GREPTIME_PASSWORD=
20 changes: 20 additions & 0 deletions postgres-fdw/vector.toml
@@ -0,0 +1,20 @@
[sources.logs_in]
type = "demo_logs"
format = "json"

[transforms.logs_json]
type = "remap"
inputs = ["logs_in"]
source = '''
. = parse_json!(.message)
'''

[sinks.logs_out2]
inputs = ["logs_json"]
type = "greptimedb_logs"
endpoint = "${GREPTIME_SCHEME:=http}://${GREPTIME_HOST:=greptimedb}:${GREPTIME_PORT:=4000}"
compression = "gzip"
dbname = "${GREPTIME_DB:=public}"
username = "${GREPTIME_USERNAME}"
password = "${GREPTIME_PASSWORD}"
table = "demo_logs_json"