Summary
As a data analyst, I want support for DuckDB so that I can easily move data between DuckDB and other databases and also benefit from DuckDB's fast CSV loading capabilities.
Description
DuckDB is an in-memory, column-based analytical database. It is designed for working with large files. The column-based design makes it good for analytical queries, e.g. those based on aggregations over whole columns or large joins.
https://duckdb.org/why_duckdb
DuckDB works as a standalone application, in a similar way to SQLite. It comes with nice tools for reading CSV files that can do neat things like auto-detect data types. These could be useful in ETL workflows.
https://duckdb.org/docs/data/csv/overview
Adding DuckDB support to ETL Helper would allow users to use the etl.copy_rows function to pull data from PostgreSQL/Oracle etc. directly into DuckDB for analysis.
Implementation
The DuckDB Python library is compatible with the DB API 2.0 specification that ETL Helper uses.
https://duckdb.org/docs/api/python/dbapi
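To illustrate why DB API 2.0 compatibility matters, here is a minimal sketch of a chunked copy that relies only on the standard cursor methods (execute, fetchmany, executemany). The function name copy_rows_sketch is hypothetical and is not ETL Helper's implementation. sqlite3 is used as a stand-in for both ends so the example runs without extra drivers; the assumption is that a DuckDB connection could take the place of the destination because it exposes the same cursor interface.

```python
import sqlite3


def copy_rows_sketch(select_query, source_conn, insert_query, dest_conn,
                     chunk_size=1000):
    """Copy rows between two DB API 2.0 connections and return the row count.

    Hypothetical helper for illustration; not ETL Helper's copy_rows.
    """
    source_cursor = source_conn.cursor()
    dest_cursor = dest_conn.cursor()
    source_cursor.execute(select_query)

    copied = 0
    while True:
        # fetchmany returns an empty sequence once the result set is exhausted
        rows = source_cursor.fetchmany(chunk_size)
        if not rows:
            break
        dest_cursor.executemany(insert_query, rows)
        copied += len(rows)

    dest_conn.commit()
    source_cursor.close()
    dest_cursor.close()
    return copied


# Stand-in source database with three rows
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE src (id INTEGER, name TEXT)")
source.executemany("INSERT INTO src VALUES (?, ?)",
                   [(1, "a"), (2, "b"), (3, "c")])

# Stand-in destination; a DuckDB connection is assumed to work the same way
dest = sqlite3.connect(":memory:")
dest.execute("CREATE TABLE dest (id INTEGER, name TEXT)")

n_copied = copy_rows_sketch("SELECT id, name FROM src", source,
                            "INSERT INTO dest VALUES (?, ?)", dest,
                            chunk_size=2)
```

Nothing in the copy loop is specific to one database, which is the property the DB API 2.0 specification is meant to provide.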
This should make it easy to add to ETL Helper. It is just a case of adding a DuckDbHelper and the appropriate tests: https://github.com/BritishGeologicalSurvey/etlhelper/blob/main/CONTRIBUTING.md#support-for-more-database-types
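As a rough idea of the shape such a helper might take, here is a self-contained sketch. DbHelperSketch is a hypothetical stand-in for ETL Helper's real base class, whose actual attributes and methods are described in the contributing guide linked above and may differ. The "filename" parameter name and the "qmark" paramstyle are assumptions that should be checked against the DuckDB documentation.

```python
class DbHelperSketch:
    """Hypothetical stand-in for ETL Helper's DbHelper base class."""

    required_params = set()
    paramstyle = ""

    def connect(self, **db_params):
        # Validate parameters before touching the driver
        missing = self.required_params - set(db_params)
        if missing:
            raise ValueError(f"Missing parameters: {sorted(missing)}")
        return self._connect(**db_params)

    def _connect(self, **db_params):
        raise NotImplementedError


class DuckDbHelper(DbHelperSketch):
    """Sketch of a DuckDB helper; names and values here are assumptions."""

    required_params = {"filename"}
    paramstyle = "qmark"  # assumption: DuckDB accepts "?" placeholders

    def _connect(self, filename):
        # Imported lazily so the class can be defined without the driver
        import duckdb
        return duckdb.connect(filename)
```

With the duckdb package installed, DuckDbHelper().connect(filename=":memory:") would be expected to return a connection that can be passed to DB API-based functions.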
Acceptance criteria