Add support for DuckDB #214

Open
3 tasks
volcan01010 opened this issue Sep 16, 2024 · 0 comments

Summary

As a data analyst, I want support for DuckDB so that I can easily move data between DuckDB and other databases and also benefit from DuckDB's fast CSV loading capabilities.

Description

DuckDB is an in-process, column-based analytical database. It is designed for working with large files. The column-based design makes it well suited to analytical queries, e.g. aggregations over whole columns or large joins.

https://duckdb.org/why_duckdb

DuckDB works as a standalone application, in a similar way to SQLite. It comes with nice tools for reading CSV files that can do neat things like auto-detect data types. These could be useful in ETL workflows.

https://duckdb.org/docs/data/csv/overview
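As a rough illustration of the CSV type detection mentioned above, the sketch below loads a small file into an in-memory DuckDB table and lists the column types that DuckDB inferred. The table name `samples`, the helper `describe_csv` and the sample data are made up for this example, and it assumes the `duckdb` package is installed (the demo is skipped otherwise).

```python
import tempfile
from pathlib import Path

try:
    import duckdb  # third-party package: pip install duckdb
except ImportError:  # keep the sketch importable when DuckDB is absent
    duckdb = None


def describe_csv(csv_path):
    """Load a CSV into an in-memory DuckDB table and return (column, type) pairs.

    Hypothetical helper for illustration only; not part of ETL Helper.
    """
    if duckdb is None:
        raise RuntimeError("duckdb is not installed")
    conn = duckdb.connect(":memory:")
    try:
        # Double any single quotes so the path is a valid SQL string literal.
        safe_path = str(csv_path).replace("'", "''")
        conn.execute(
            f"CREATE TABLE samples AS SELECT * FROM read_csv_auto('{safe_path}')"
        )
        # DESCRIBE returns one row per column: name first, inferred type second.
        rows = conn.execute("DESCRIBE samples").fetchall()
        return [(row[0], row[1]) for row in rows]
    finally:
        conn.close()


if duckdb is not None:
    with tempfile.TemporaryDirectory() as tmp_dir:
        csv_path = Path(tmp_dir) / "samples.csv"
        csv_path.write_text(
            "id,depth_m,collected\n1,2.5,2024-01-01\n2,3.75,2024-02-15\n"
        )
        for name, dtype in describe_csv(csv_path):
            print(name, dtype)
```

No column types were declared anywhere in the example; whatever types are printed come from DuckDB's own sniffing of the file.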

Adding DuckDB support to ETL Helper would allow users to use etl.copy_rows to pull data from PostgreSQL, Oracle, etc. directly into DuckDB for analysis.
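To make the idea concrete, here is a minimal sketch of a chunked copy between two DB API 2.0 connections. It is not ETL Helper's implementation: `copy_rows_sketch` is a hypothetical function whose argument order is only loosely modelled on etl.copy_rows, and two in-memory SQLite databases stand in for the source database and for DuckDB so that the example runs with the standard library alone. A DuckDB connection would be expected to fit the same shape given its DB API 2.0 compatibility, but that is not tested here.

```python
import sqlite3


def copy_rows_sketch(select_query, src_conn, insert_query, dest_conn, chunk_size=5000):
    """Stream rows from src_conn into dest_conn in chunks; return the row count.

    Hypothetical illustration that relies only on DB API 2.0 cursor methods.
    """
    src_cursor = src_conn.cursor()
    dest_cursor = dest_conn.cursor()
    copied = 0
    try:
        src_cursor.execute(select_query)
        while True:
            # fetchmany keeps memory use bounded for large source tables.
            chunk = src_cursor.fetchmany(chunk_size)
            if not chunk:
                break
            dest_cursor.executemany(insert_query, chunk)
            copied += len(chunk)
        dest_conn.commit()
    finally:
        src_cursor.close()
        dest_cursor.close()
    return copied


# Stand-in source database (would be PostgreSQL, Oracle, etc. in practice).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE boreholes (id INTEGER, depth_m REAL)")
src.executemany(
    "INSERT INTO boreholes VALUES (?, ?)", [(1, 12.5), (2, 40.0), (3, 7.25)]
)

# Stand-in destination database (would be DuckDB under this proposal).
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE boreholes (id INTEGER, depth_m REAL)")

n_copied = copy_rows_sketch(
    "SELECT id, depth_m FROM boreholes ORDER BY id",
    src,
    "INSERT INTO boreholes VALUES (?, ?)",
    dest,
    chunk_size=2,
)
print(n_copied, dest.execute("SELECT * FROM boreholes ORDER BY id").fetchall())
```

Because both sides are touched only through `cursor()`, `execute`, `fetchmany`, `executemany` and `commit`, swapping the destination for another compliant driver should mainly be a matter of adjusting the insert query's placeholder style.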

Implementation

The DuckDB Python library is compatible with the DB API 2.0 specification that ETL Helper uses.

https://duckdb.org/docs/api/python/dbapi

This should make it easy to add to ETL Helper. It is just a case of adding a DuckDbHelper and the appropriate tests: https://github.com/BritishGeologicalSurvey/etlhelper/blob/main/CONTRIBUTING.md#support-for-more-database-types
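A new helper is likely to need a few driver details up front, such as the parameter placeholder style. The DB API 2.0 specification (PEP 249) says compliant modules expose these as the module-level globals `apilevel`, `threadsafety` and `paramstyle`. The sketch below reads them from any driver module; `dbapi_globals` is a hypothetical name, and `sqlite3` is used in the demo only because it ships with Python. Passing the `duckdb` module instead should report its values, assuming the package is installed.

```python
import sqlite3


def dbapi_globals(module):
    """Return the PEP 249 module globals of a driver, or None where missing."""
    names = ("apilevel", "threadsafety", "paramstyle")
    return {name: getattr(module, name, None) for name in names}


info = dbapi_globals(sqlite3)
for name, value in info.items():
    print(f"{name}: {value}")
```

The `paramstyle` value determines which placeholders insert queries must use for that driver, for example `?` for the `qmark` style.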

Acceptance criteria

  • DbHelper for DuckDB is added
  • Integration test suite is added and runs in GitHub Actions
  • Documentation updated