A general-purpose bidirectional data synchronization tool that enables seamless data transfer between Dune Analytics and PostgreSQL databases. This project was developed as part of the CoW DAO Grants Program.
- Dune to PostgreSQL: Archive data from Dune queries into your local database
- PostgreSQL to Dune: Upload local data to Dune tables via the CSV upload endpoint
- Configuration-based: Simple YAML configuration for defining sources, destinations, and jobs
- Docker-ready: Easy deployment using pre-built container images
Configuration is provided in a single YAML file. Refer to the existing `config.yaml` for an overview.
The configuration file consists of the following main sections:

- `data_sources`: Defines available databases
- `jobs`: Defines synchronization jobs that connect sources to destinations
The config file may contain environment variable placeholders in `envsubst`-compatible format:

- `$VAR_NAME`
- `${VAR_NAME}`
- `$varname`

Note: Every variable referenced this way must be defined at runtime; otherwise the program exits with an error.
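The overall shape of such a file is roughly the following sketch; the entry names and values here are illustrative, and the real sample lives in the repository's `config.yaml`:

```yaml
# Illustrative skeleton only; see config.yaml in the repository for a complete example.
data_sources:
  - name: Postgres
    type: postgres
    key: ${DB_URL}   # substituted from the environment at runtime
jobs:
  - name: example-job
    # source and destination definitions go here; see the sections below
```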
- By default, the program looks for a file called `config.yaml` next to `main.py`
- You may pass a configuration file using the `--config` parameter
- The argument to `--config` may be a filename, a file path, or a URL starting with `http://` or `https://`
- If a URL is passed, it is downloaded and its contents are used as the program's configuration
- The file, or the content served at the given URL, must be valid YAML and encoded in UTF-8
Sources are defined as a list of configurations, each containing:

- `name`: String. A unique identifier for referencing this source in jobs
- `type`: String. Must be either `dune` or `postgres`
- `key`: String. Connection details; supports environment variable templating using `${VAR_NAME}` syntax such as `${DB_URL}` or `${DUNE_API_KEY}` (see environment setup)
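For illustration, a `data_sources` list declaring one source of each type might look like the following sketch; the names are the ones referenced via `ref` in the job examples below, and the `key` values are resolved from the environment:

```yaml
data_sources:
  - name: Dune                # referenced by jobs via `ref: Dune`
    type: dune
    key: ${DUNE_API_KEY}
  - name: Postgres            # referenced by jobs via `ref: Postgres`
    type: postgres
    key: ${DB_URL}
```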
Each job in the `jobs` list should contain:

- `name`: Optional String. A human-readable name for the job
- `source`: Definition of which source to use and how to fetch data
- `destination`: Definition of which destination to use and how to store data
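As a sketch, assuming the source- and destination-specific fields described in the following sections sit directly under `source` and `destination` alongside the `ref` to a configured data source (all IDs and names here are placeholders):

```yaml
jobs:
  - name: d2p-example            # illustrative job name
    source:
      ref: Dune                  # must match a data_sources name
      query_id: 1234567          # placeholder query ID
    destination:
      ref: Postgres
      table_name: dune_results   # placeholder table name
      if_exists: append
```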
For Dune sources (`ref: Dune`):

- `query_id`: Integer. ID of an existing Dune Query to execute
- `query_engine`: Optional String. Either `medium` or `large`. Defaults to `medium`
- `poll_frequency`: Optional Integer. Seconds between result polling. Defaults to `1`
- `parameters`: Optional list of Dune Query parameters
  - `name`: String. Parameter name
  - `type`: String. Must be one of: `TEXT`, `NUMBER`, `DATE`, `ENUM`
  - `value`: Any. Value for the parameter
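Putting these fields together, a Dune `source` block with parameters might look like this sketch; the query ID and the parameter names and values are placeholders:

```yaml
source:
  ref: Dune
  query_id: 1234567        # placeholder: ID of an existing Dune query
  query_engine: large
  poll_frequency: 5
  parameters:
    - name: blockchain     # placeholder parameter
      type: TEXT
      value: ethereum
    - name: start_date
      type: DATE
      value: "2024-01-01"
```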
For Postgres sources (`ref: Postgres`):

- `query_string`: String. SQL query, or path to a `.sql` file (relative to `main.py` or absolute)
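A Postgres `source` block can carry the SQL inline or point at a `.sql` file; the query text and file path in this sketch are illustrative:

```yaml
source:
  ref: Postgres
  # Inline SQL ...
  query_string: "SELECT block_number, tx_hash FROM app_data LIMIT 100"
  # ... or, alternatively, a path to a .sql file (relative to main.py, or absolute):
  # query_string: queries/app_data.sql
```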
For Dune destinations (`ref: Dune`):

- `table_name`: String. Name of the Dune table to update
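For the Postgres-to-Dune direction, the `destination` block only needs the target table; a sketch with an illustrative name:

```yaml
destination:
  ref: Dune
  table_name: my_local_data   # placeholder Dune table name
```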
For Postgres destinations (`ref: Postgres`):

- `table_name`: String. Name of the table to insert/append into
- `if_exists`: String. One of `upsert`, `insert_ignore`, `replace` or `append`
- `index_columns`: String[]. Only relevant for `upsert` or `insert_ignore`: a list of columns on which to detect conflicts
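For example, an upsert into a local table keyed on two columns could be declared as in the following sketch; the table and column names are placeholders:

```yaml
destination:
  ref: Postgres
  table_name: dune_results
  if_exists: upsert
  index_columns:
    - block_number
    - tx_hash
```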
Copy `.env.sample` to `.env` and fill out the two required variables:

- `DUNE_API_KEY` - Valid API key for Dune
- `DB_URL` - Connection string for the source and/or destination PostgreSQL database, in the form `postgresql://postgres:postgres@localhost:5432/postgres`
- (Optional) `PROMETHEUS_PUSHGATEWAY_URL` - URL of a Prometheus Pushgateway which receives job-related metrics
You can download the image from GitHub Container Registry:
```sh
docker pull ghcr.io/bh2smith/dune-sync:latest
```
Or build it yourself:
```sh
export IMAGE_NAME=dune-sync  # (or the ghcr.io image above)
docker build -t ${IMAGE_NAME} .
```
```sh
# Base docker command (using config.yaml mounted at /app/config.yaml)
docker run --rm \
  -v "$(pwd)/config.yaml:/app/config.yaml" \
  --env-file .env \
  ${IMAGE_NAME}

# Optional additions:
# - Mount a custom config file (requires the --config flag)
  -v "$(pwd)/my-config.yaml:/app/my-config.yaml" \
# - Mount a queries directory (if using SQL file paths)
  -v "$(pwd)/queries:/app/queries" \
  --config /app/my-config.yaml
# - Specify jobs to run (if not specified, all jobs will be run)
  --jobs job1 job2
```
Note that postgres queries can also be file paths (they would also need to be mounted into the container).
Fill out the empty fields in Sample Env (e.g. `DUNE_API_KEY` and `DB_URL`).
```sh
docker-compose up -d  # Starts postgres container (in the background)
python -m src.main [--config config.yaml] [--jobs d2p-test-1 p2d-test]
```
To get started with development:
```sh
python -m pip install poetry  # Install poetry, which is used to manage the project's dependencies
poetry install                # Set up virtual environment
poetry shell                  # Activate virtual environment
```
The project uses a Makefile to streamline development tasks. Here are the available commands:
- `make fmt`: Formats code using black
- `make lint`: Runs pylint for code quality checks
- `make types`: Performs static type checking with mypy
- `make check`: Runs formatting, linting, and type checking in sequence
- `make test`: Runs pytest with coverage reporting (minimum 93% coverage required)
- `make clean`: Removes Python cache files
- `make run`: Executes the main application (although it does not accept runtime parameters)
To pass runtime arguments, use for example:

```sh
python -m src.main --jobs cow-solvers
```