Indexing Substrate networks (#1168)
droserasprout authored Dec 23, 2024
1 parent ee5ceb9 commit 50291a7
Showing 199 changed files with 21,865 additions and 904 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
!**/pdm.lock
!**/README.md
!**/.keep
!**/py.typed

# Add Python code
!**/*.py
Expand Down
36 changes: 36 additions & 0 deletions .vscode/launch.json
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,42 @@
"DIPDUP_NO_SYMLINK": "1"
}
},
{
"name": "demo_substrate_events: run",
"type": "debugpy",
"request": "launch",
"module": "dipdup",
"args": [
"-e",
".env",
"run"
],
"console": "integratedTerminal",
"cwd": "${workspaceFolder}/src/demo_substrate_events",
"justMyCode": false,
"env": {
"DIPDUP_DEBUG": "1",
"DIPDUP_NO_SYMLINK": "1"
}
},
{
"name": "demo_substrate_events: init",
"type": "debugpy",
"request": "launch",
"module": "dipdup",
"args": [
"-e",
".env",
"init"
],
"console": "integratedTerminal",
"cwd": "${workspaceFolder}/src/demo_substrate_events",
"justMyCode": false,
"env": {
"DIPDUP_DEBUG": "1",
"DIPDUP_NO_SYMLINK": "1"
}
},
{
"name": "demo_evm_events: run",
"type": "debugpy",
Expand Down
29 changes: 20 additions & 9 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,22 @@ The format is based on [Keep a Changelog], and this project adheres to [Semantic

Releases prior to 7.0 have been removed from this file to declutter search results; see the [archived copy](https://github.com/dipdup-io/dipdup/blob/8.0.0b5/CHANGELOG.md) for the full list.

## [Unreleased]
## [8.2.0rc1] - ????-??-??

### Added

- substrate.events: Added `substrate.events` index kind to process Substrate events.
- substrate.node: Added `substrate.node` datasource to receive data from Substrate node.
- substrate.subscan: Added `substrate.subscan` datasource to fetch ABIs from Subscan.
- substrate.subsquid: Added `substrate.subsquid` datasource to fetch historical data from Squid Network.

### Fixed

- subsquid: Fixed float type for `timestamp` field on event / transaction deserialization.
- subsquid: Fixed empty field base conversion on event deserialization.
- evm.subsquid: Fixed event/transaction model deserialization.

### Changed

- evm.etherscan: Datasource has been renamed from `abi.etherscan` to `evm.etherscan` for consistency.

## [8.1.3] - 2024-12-20

Expand Down Expand Up @@ -45,7 +55,7 @@ Releases prior to 7.0 has been removed from this file to declutter search result

### Added

- abi.etherscan: Try to extract ABI from webpage when API call fails.
- evm.etherscan: Try to extract ABI from webpage when API call fails.
- cli: Added `schema` subcommands to manage database migrations: `migrate`, `upgrade`, `downgrade`, `heads` and `history`.
- cli: Added interactive mode for `new` command.
- database: Support database migrations using [`aerich`](https://github.com/tortoise/aerich).
Expand Down Expand Up @@ -225,7 +235,7 @@ Releases prior to 7.0 has been removed from this file to declutter search result
### Removed

- config: `node_only` index config flag has been removed; add `evm.node` datasource(s) to the `datasources` list instead.
- config: `abi` index config field has been removed; add `abi.etherscan` datasource(s) to the `datasources` list instead.
- config: `abi` index config field has been removed; add `evm.etherscan` datasource(s) to the `datasources` list instead.

### Other

Expand Down Expand Up @@ -285,7 +295,7 @@ Releases prior to 7.0 has been removed from this file to declutter search result

### Fixed

- abi.etherscan: Raise `AbiNotAvailableError` when contract is not verified.
- evm.etherscan: Raise `AbiNotAvailableError` when contract is not verified.
- cli: Fixed incorrect indexer status logging.
- evm.node: Fixed memory leak when using realtime subscriptions.
- evm.node: Fixed processing chain reorgs.
Expand Down Expand Up @@ -343,7 +353,7 @@ Releases prior to 7.0 has been removed from this file to declutter search result

### Fixed

- abi.etherscan: Fixed handling "rate limit reached" errors.
- evm.etherscan: Fixed handling "rate limit reached" errors.
- cli: Fixed setting logger levels based on config and env variables.
- http: Fixed incorrect number of retries performed on failed requests.

Expand Down Expand Up @@ -511,7 +521,7 @@ Releases prior to 7.0 has been removed from this file to declutter search result

### Added

- abi.etherscan: Added `abi.etherscan` datasource to fetch ABIs from Etherscan.
- evm.etherscan: Added `evm.etherscan` datasource to fetch ABIs from Etherscan.
- api: Added `/performance` endpoint to request indexing stats.
- cli: Added `report` command group to manage performance and crash reports created by DipDup.
- config: Added `advanced.decimal_precision` field to overwrite precision if it's not guessed correctly based on project models.
Expand Down Expand Up @@ -567,7 +577,8 @@ Releases prior to 7.0 has been removed from this file to declutter search result
[semantic versioning]: https://semver.org/spec/v2.0.0.html

<!-- Versions -->
[Unreleased]: https://github.com/dipdup-io/dipdup/compare/8.1.3...HEAD
[Unreleased]: https://github.com/dipdup-io/dipdup/compare/8.2.0rc1...HEAD
[8.2.0rc1]: https://github.com/dipdup-io/dipdup/compare/8.1.3...8.2.0rc1
[8.1.3]: https://github.com/dipdup-io/dipdup/compare/8.1.2...8.1.3
[8.1.2]: https://github.com/dipdup-io/dipdup/compare/8.1.1...8.1.2
[8.1.1]: https://github.com/dipdup-io/dipdup/compare/8.1.0...8.1.1
Expand Down
181 changes: 181 additions & 0 deletions docs/0.quickstart-substrate.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,181 @@
---
title: "Quickstart"
description: "This page will guide you through the steps to get your first selective indexer up and running in a few minutes without getting too deep into the details."
navigation.icon: "stars"
---

# Quickstart

::banner{type="warning"}
Substrate support is in early preview stage. API and features may change in the future.
::

This page will guide you through the steps to get your first selective indexer up and running in a few minutes without getting too deep into the details.

A selective blockchain indexer is an application that extracts and organizes specific blockchain data from multiple data sources, rather than processing all blockchain data. It allows users to index only relevant entities, reducing storage and computational requirements compared to full node indexing, and query data more efficiently for specific use cases. Think of it as a customizable filter that captures and stores only the blockchain data you need, making data retrieval faster and more resource-efficient. DipDup is a framework that helps you implement such an indexer.
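
As a rough illustration of the "customizable filter" idea, the sketch below keeps only the events an indexer cares about and ignores the rest. It is plain Python rather than DipDup API; the `Event` class, the `select_events` helper and the event names are hypothetical stand-ins chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified blockchain event (not a DipDup type)."""

    name: str
    block: int
    payload: dict


def select_events(events, wanted_names):
    """Yield only the events whose name is in `wanted_names`."""
    wanted = frozenset(wanted_names)
    for event in events:
        if event.name in wanted:
            yield event


# Illustrative data; the event names are assumptions.
chain = [
    Event('Assets.Transferred', 100, {'amount': 5}),
    Event('Balances.Deposit', 100, {'amount': 1}),
    Event('Assets.Transferred', 101, {'amount': 7}),
]
selected = list(select_events(chain, ['Assets.Transferred']))
```

In a real project the framework does this matching for you based on the config; the point here is only that unrelated events never reach your storage.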

Let's create an indexer for balance transfers in the AssetHub network. Our goal is to save all transfers to the database and then calculate some statistics on holders' activity.

## Install DipDup

A modern Linux/macOS distribution with Python 3.12 installed is required to run DipDup.

The recommended way to install the DipDup CLI is with [pipx](https://pipx.pypa.io/stable/). We also provide a convenient helper script that installs all necessary tools. Run the following command in your terminal:

{{ #include _curl-spell.md }}

See the [Installation](../docs/1.getting-started/1.installation.md) page for all options.

After installation, run the following command to switch to the preview branch:

```shell [Terminal]
dipdup self install -f -r feat/substrate
```

## Create a project

DipDup CLI has a built-in project generator. Run the following command in your terminal:

```shell [Terminal]
dipdup new
```

Choose `From template`, then `Substrate` network and `demo_substrate_events` template.

::banner{type="note"}
Want to skip the tutorial and start from scratch? Choose `Blank` at the first step instead and proceed to the [Config](../docs/1.getting-started/3.config.md) section.
::

Follow the instructions; the project will be created in a new directory.

## Write a configuration file

In the project root, you'll find a file named `dipdup.yaml`. It's the main configuration file of your indexer. We will discuss it in detail in the [Config](../docs/1.getting-started/3.config.md) section; now it has the following content:

```yaml [dipdup.yaml]
{{ #include ../src/demo_substrate_events/dipdup.yaml }}
```

## Generate types and stubs

Now it's time to generate typeclasses and callback stubs based on definitions from config. Examples below use `demo_substrate_events` as a package name; yours may differ.

Run the following command:

```shell [Terminal]
dipdup init
```

DipDup will create a Python package `demo_substrate_events` with everything you need to start writing your indexer. Use the `package tree` command to see the generated structure:

```shell [Terminal]
$ dipdup package tree
demo_substrate_events [/home/droserasprout/git/dipdup/src/demo_substrate_events]
├── abi
│ ├── assethub/v1000000.json
│ ├── assethub/v1001002.json
│ ├── ...
│ └── assethub/v9430.json
├── configs
│ ├── dipdup.compose.yaml
│ ├── dipdup.sqlite.yaml
│ ├── dipdup.swarm.yaml
│ └── replay.yaml
├── deploy
│ ├── .env.default
│ ├── Dockerfile
│ ├── compose.sqlite.yaml
│ ├── compose.swarm.yaml
│ ├── compose.yaml
│ ├── sqlite.env.default
│ └── swarm.env.default
├── graphql
├── handlers
│ ├── batch.py
│ └── on_transfer.py
├── hasura
├── hooks
│ ├── on_index_rollback.py
│ ├── on_reindex.py
│ ├── on_restart.py
│ └── on_synchronized.py
├── models
│ └── __init__.py
├── sql
├── types
│ ├── assethub/substrate_events/assets_transferred/__init__.py
│ ├── assethub/substrate_events/assets_transferred/v601.py
│ └── assethub/substrate_events/assets_transferred/v700.py
└── py.typed
```

That's a lot of files and directories! But don't worry, we will need only the `models` and `handlers` directories in this guide.

## Define data models

DipDup supports storing data in SQLite, PostgreSQL and TimescaleDB databases. We use a modified [Tortoise ORM](https://tortoise.github.io/) library as an abstraction layer.

First, you need to define a model class. DipDup uses model definitions both for database schema and autogenerated GraphQL API. Our schema will consist of a single model `Holder` with the following fields:

| | |
| ----------- | ----------------------------------- |
| `address` | account address |
| `balance` | token amount held by the account |
| `turnover` | total amount of transfer/mint calls |
| `tx_count` | number of transfers/mints |
| `last_seen` | time of the last transfer/mint |

Here's how to define this model in DipDup:

```python [models/__init__.py]
{{ #include ../src/demo_substrate_events/models/__init__.py }}
```

Using the ORM is not a requirement; DipDup provides helpers to run SQL queries/scripts directly, see the [Database](1.getting-started/5.database.md) page.

## Implement handlers

Everything's ready to implement the actual indexer logic.

Our task is to index all balance updates. Add some code to the `on_transfer` handler callback to process matched events:

```python [handlers/on_transfer.py]
{{ #include ../src/demo_substrate_events/handlers/on_transfer.py }}
```
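
To make the bookkeeping easier to follow without the ORM, here is a minimal sketch of the kind of update such a handler performs for each transfer. It is not the project's handler and uses no DipDup API: the `Holder` dataclass only mirrors the fields from the table above, and `apply_transfer` is a hypothetical helper that assumes `turnover` accumulates transferred amounts.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class Holder:
    """Plain-Python stand-in for the `Holder` model (not the ORM class)."""

    address: str
    balance: Decimal = Decimal(0)
    turnover: Decimal = Decimal(0)
    tx_count: int = 0
    last_seen: Optional[int] = None


def apply_transfer(holders, sender, receiver, amount, timestamp):
    """Debit `sender`, credit `receiver`, and update activity counters."""
    for address, delta in ((sender, -amount), (receiver, amount)):
        # Create the holder on first sight, then update it in place.
        holder = holders.setdefault(address, Holder(address))
        holder.balance += delta
        holder.turnover += abs(delta)
        holder.tx_count += 1
        holder.last_seen = timestamp


# Sample data; balances can go negative here because no mint is modelled.
holders = {}
apply_transfer(holders, 'alice', 'bob', Decimal('10'), timestamp=1_700_000_000)
apply_transfer(holders, 'bob', 'carol', Decimal('4'), timestamp=1_700_000_060)
```

In the real handler the same per-address updates would go through the ORM model and be persisted to the database instead of a dictionary.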

And that's all! We can run the indexer now.

## Next steps

Run the indexer in memory:

```shell
dipdup run
```

Store data in a SQLite database (the file is created in `/tmp` by default; set the `SQLITE_PATH` env variable to change it):

```shell
dipdup -c . -c configs/dipdup.sqlite.yaml run
```

Or spawn a Compose stack with PostgreSQL and Hasura:

```shell
cd deploy
cp .env.default .env
# Edit .env file before running
docker-compose up
```

DipDup will fetch all the historical data and then switch to realtime updates. You can check the progress in the logs.

If you use SQLite, run this query to check the data:

```bash
sqlite3 /tmp/demo_substrate_events.sqlite 'SELECT * FROM holder LIMIT 10'
```
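
The same check can be scripted with Python's standard `sqlite3` module. The sketch below makes some assumptions: the `holder` table name comes from the query above, the `address` and `balance` columns are assumed to match the model fields, and `top_holders` is a hypothetical helper. It runs against an in-memory database with sample rows so it works without an indexer; pass the path of your actual database file to `sqlite3.connect()` to query real data.

```python
import sqlite3


def top_holders(conn, limit=10):
    """Return (address, balance) rows ordered by balance, largest first."""
    cursor = conn.execute(
        'SELECT address, balance FROM holder ORDER BY balance DESC LIMIT ?',
        (limit,),
    )
    return cursor.fetchall()


# In-memory stand-in for the real database file; schema and rows are sample data.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE holder (address TEXT PRIMARY KEY, balance REAL)')
conn.executemany(
    'INSERT INTO holder VALUES (?, ?)',
    [('alice', 5.0), ('bob', 12.5), ('carol', 0.5)],
)
rows = top_holders(conn, limit=2)
conn.close()
```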

If you run a Compose stack, open `http://127.0.0.1:8080` in your browser to see the Hasura console (the exposed port may differ). You can use it to explore the database and build GraphQL queries.
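
Queries built in the console can also be sent from a script. The following is a sketch under stated assumptions: Hasura serves GraphQL at the `/v1/graphql` path, while the `holder` root field and its `address` and `balance` columns are assumed from the model fields and may be named differently in your deployment. The `build_request` helper is hypothetical and only constructs the request, so the example runs without a live stack.

```python
import json
import urllib.request

# Assumed endpoint: host and port may differ in your setup.
HASURA_URL = 'http://127.0.0.1:8080/v1/graphql'

# Assumed field names: a `holder` table with `address` and `balance` columns.
QUERY = """
query TopHolders($limit: Int!) {
  holder(order_by: {balance: desc}, limit: $limit) {
    address
    balance
  }
}
"""


def build_request(limit):
    """Build (but do not send) a POST request carrying the GraphQL query."""
    body = json.dumps({'query': QUERY, 'variables': {'limit': limit}}).encode()
    return urllib.request.Request(
        HASURA_URL,
        data=body,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


request = build_request(limit=10)
# To run it against a live stack:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

If your Hasura instance requires an admin secret or another auth header, add it to `headers` before sending.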

Congratulations! You've just created your first DipDup indexer. Proceed to the Getting Started section to learn more about DipDup configuration and features.
27 changes: 15 additions & 12 deletions docs/1.getting-started/7.datasources.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,18 +9,21 @@ Datasources are DipDup connectors to various APIs. They are defined in config an

Index datasources, ones that can be attached to a specific index, are prefixed with blockchain name, e.g. `tezos.tzkt` or `evm.subsquid`.

| kind | blockchain | description |
| ------------------------------------------------------------ | ---------------- | ------------------------------- |
| [evm.subsquid](../3.datasources/1.evm_subsquid.md) | ⟠ EVM-compatible | Subsquid Network API |
| [evm.node](../3.datasources/2.evm_node.md) | ⟠ EVM-compatible | Ethereum node |
| [abi.etherscan](../3.datasources/3.abi_etherscan.md) | ⟠ EVM-compatible | Provides ABIs for EVM contracts |
| [starknet.subsquid](../3.datasources/4.starknet_subsquid.md) | 🐺 Starknet | Subsquid Network API |
| [starknet.node](../3.datasources/5.starknet_node.md) | 🐺 Starknet | Starknet node |
| [tezos.tzkt](../3.datasources/6.tezos_tzkt.md) | ꜩ Tezos | TzKT API |
| [tzip_metadata](../3.datasources/7.tzip_metadata.md) | ꜩ Tezos | TZIP-16 metadata |
| [coinbase](../3.datasources/8.coinbase.md) | any | Coinbase price feed |
| [ipfs](../3.datasources/9.ipfs.md) | any | IPFS gateway |
| [http](../3.datasources/10.http.md) | any | Generic HTTP API |
| kind | blockchain | description |
| -------------------------------------------------------------- | ---------------- | ----------------------------------------------- |
| [evm.subsquid](../3.datasources/1.evm_subsquid.md) | ⟠ EVM-compatible | Subsquid Network API |
| [evm.node](../3.datasources/2.evm_node.md) | ⟠ EVM-compatible | Ethereum node |
| [evm.etherscan](../3.datasources/3.evm_etherscan.md) | ⟠ EVM-compatible | Provides ABIs for EVM contracts |
| [starknet.subsquid](../3.datasources/4.starknet_subsquid.md) | 🐺 Starknet | Subsquid Network API |
| [starknet.node](../3.datasources/5.starknet_node.md) | 🐺 Starknet | Starknet node |
| [substrate.node](../3.datasources/6.substrate_node.md) | 🔮 Substrate | Substrate node |
| [substrate.subscan](../3.datasources/7.substrate_subscan.md) | 🔮 Substrate | Provides pallet metadata for Substrate networks |
| [substrate.subsquid](../3.datasources/8.substrate_subsquid.md) | 🔮 Substrate | Subsquid Network API |
| [tezos.tzkt](../3.datasources/9.tezos_tzkt.md) | ꜩ Tezos | TzKT API |
| [tzip_metadata](../3.datasources/10.tzip_metadata.md) | ꜩ Tezos | TZIP-16 metadata |
| [coinbase](../3.datasources/11.coinbase.md) | any | Coinbase price feed |
| [ipfs](../3.datasources/12.ipfs.md) | any | IPFS gateway |
| [http](../3.datasources/13.http.md) | any | Generic HTTP API |

## Connection settings

Expand Down
15 changes: 8 additions & 7 deletions docs/1.getting-started/8.indexes.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,13 +14,14 @@ Multiple indexes are available for different workloads. Every index is linked to
| [evm.events](../2.indexes/1.evm_events.md) | ⟠ EVM-compatible | `evm` | event logs |
| [evm.transactions](../2.indexes/2.evm_transactions.md) | ⟠ EVM-compatible | `evm` | transactions |
| [starknet.events](../2.indexes/3.starknet_events.md) | 🐺 Starknet | `starknet` | event logs |
| [tezos.big_maps](../2.indexes/4.tezos_big_maps.md) | ꜩ Tezos | `tezos` | big map diffs |
| [tezos.events](../2.indexes/5.tezos_events.md) | ꜩ Tezos | `tezos` | events |
| [tezos.head](../2.indexes/6.tezos_head.md) | ꜩ Tezos | `tezos` | head blocks (realtime only) |
| [tezos.operations](../2.indexes/7.tezos_operations.md) | ꜩ Tezos | `tezos` | typed operations |
| [tezos.operations_unfiltered](../2.indexes/8.tezos_operations_unfiltered.md) | ꜩ Tezos | `tezos` | untyped operations |
| [tezos.token_balances](../2.indexes/9.tezos_token_balances.md) | ꜩ Tezos | `tezos` | TZIP-12/16 token balances |
| [tezos.token_transfers](../2.indexes/10.tezos_token_transfers.md) | ꜩ Tezos | `tezos` | TZIP-12/16 token transfers |
| [substrate.events](../2.indexes/4.substrate_events.md) | 🔮 Substrate | `substrate` | pallet events |
| [tezos.big_maps](../2.indexes/5.tezos_big_maps.md) | ꜩ Tezos | `tezos` | big map diffs |
| [tezos.events](../2.indexes/6.tezos_events.md) | ꜩ Tezos | `tezos` | events |
| [tezos.head](../2.indexes/7.tezos_head.md) | ꜩ Tezos | `tezos` | head blocks (realtime only) |
| [tezos.operations](../2.indexes/8.tezos_operations.md) | ꜩ Tezos | `tezos` | typed operations |
| [tezos.operations_unfiltered](../2.indexes/9.tezos_operations_unfiltered.md) | ꜩ Tezos | `tezos` | untyped operations |
| [tezos.token_balances](../2.indexes/10.tezos_token_balances.md) | ꜩ Tezos | `tezos` | TZIP-12/16 token balances |
| [tezos.token_transfers](../2.indexes/11.tezos_token_transfers.md) | ꜩ Tezos | `tezos` | TZIP-12/16 token transfers |

Indexes can join multiple contracts considered as a single application. Also, contracts can be used by multiple indexes of any kind, but make sure that they are independent of each other and that indexed data don't overlap.

Expand Down