1 parent 3bc4993, commit 446b15a
Showing 1 changed file with 13 additions and 67 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,83 +1,29 @@ | ||
**What's with the name?** | ||
|
||
# droughty | ||
#### adjective, drought·i·er, drought·i·est. | ||
#### dry. | ||
|
||
[![Build Status](https://travis-ci.org/joemccann/dillinger.svg?branch=master)](https://travis-ci.org/joemccann/dillinger) | ||
It helps keep your workflow *ahem* dry | ||
|
||
================== | ||
|
||
**What is droughty?** | ||
|
||
droughty is an analytics engineering toolkit, helping keep your workflow dry. Current tools are: | ||
|
||
- lookml - generates a base layer.lkml file with views and explores from a warehouse schema | ||
- dbt-tests - generates a base schema from specified warehouse schemas. Includes standard testing routines | ||
- lookml - generates a lkml with views, explores and measures from a warehouse schema | ||
- dbt - generates a base schema from specified warehouse schemas. Includes standard testing routines | ||
- dbml - generates an ERD based on the warehouse layer of your warehouse. Includes pk, fk relationships | ||
- cube - generates a cube schema including dimensions, integrations and measures | ||
|
||
The purpose of this project is to automate the repetitive, dull elements of analytics engineering in the modern data stack. It turns out this also leads to cleaner projects and less human error, and increases the likelihood of the basics getting done... | ||
|
||
## Tech | ||
|
||
droughty uses a number of open-source projects to work properly: | ||
|
||
- [lkml](https://pypi.org/project/lkml/) - This project uses lkml as its base parser - John Temple | ||
- [ruamel.yaml](https://pypi.org/project/ruamel.yaml/) - Yaml parser - Anthon van der Neut | ||
|
||
Some more generic dependencies: | ||
|
||
- Pandas | ||
- Python Git | ||
- Click | ||
- Pandas GBQ | ||
- Protobuf | ||
- snowflake_connector_python | ||
|
||
## Considerations | ||
|
||
You need to run droughty from a git repo. It uses the Git package to resolve certain relative directories. | ||
Currently the CLI sub-commands have an issue where they are not all mutually exclusive. This needs to be resolved but doesn't affect usage dramatically. | ||
|
||
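The git-repo requirement above can be illustrated with a minimal sketch. It does not use the Git package the README mentions and is not droughty's own code. It swaps in a plain `pathlib` walk to show what "running from a git repo" means in practice. The function name `find_git_root` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains a .git entry.

    Hypothetical helper for illustration only; droughty itself relies on a
    Git package rather than this manual walk.
    """
    current = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    return None


if __name__ == "__main__":
    root = find_git_root(Path.cwd())
    if root is None:
        print("Not inside a git repository; droughty expects to be run from one.")
    else:
        print(f"Repository root: {root}")
```

Relative directories can then be built from the returned root rather than from the current working directory, which is presumably why the tool needs to be started somewhere inside a repository.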
|
||
And of course droughty itself is open source with a [public repository][dill] | ||
on GitHub. | ||
|
||
## Installation | ||
|
||
- pip install droughty | ||
- profile.yaml set-up | ||
|
||
|
||
## Profile.yaml example | ||
|
||
droughty_demo: | ||
|
||
host: | ||
|
||
key_file: /Users/droughty_user/[key_file] | ||
|
||
password: | ||
|
||
port: | ||
|
||
project_name: example-project | ||
|
||
schema_name: analytics_qa | ||
|
||
user: | ||
|
||
warehouse_name: big_query | ||
|
||
test_schemas: | ||
|
||
example_dev_staging | ||
|
||
example_dev_integration | ||
|
||
example_analytics_dev | ||
|
||
|
||
Droughty depends on a droughty_project.yaml file. There are plans to extend the variables available within a project, but for the moment it simply tells droughty which profile.yaml project to run against. | ||
**ReadTheDocs** | ||
|
||
## droughty_project.yaml example | ||
https://droughty.readthedocs.io/en/latest/ | ||
|
||
profile: droughty_demo | ||
|
||
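The relationship between droughty_project.yaml and profile.yaml can be sketched roughly as a lookup by name. The snippet below is an illustration under stated assumptions, not droughty's implementation. The dictionaries stand in for already-parsed YAML, `resolve_profile` is a hypothetical name, and treating `test_schemas` as a list is an assumption about the profile layout.

```python
from typing import Any, Dict

# Stand-ins for parsed YAML files (assumed shapes, values taken from the examples).
PROFILES: Dict[str, Dict[str, Any]] = {
    "droughty_demo": {
        "project_name": "example-project",
        "schema_name": "analytics_qa",
        "warehouse_name": "big_query",
        # Assumed to be a list of schema names.
        "test_schemas": [
            "example_dev_staging",
            "example_dev_integration",
            "example_analytics_dev",
        ],
    }
}
PROJECT: Dict[str, Any] = {"profile": "droughty_demo"}


def resolve_profile(project: Dict[str, Any],
                    profiles: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return the profile entry named by the project's 'profile' key."""
    name = project.get("profile")
    if name is None:
        raise KeyError("droughty_project.yaml has no 'profile' key")
    if name not in profiles:
        raise KeyError(f"profile '{name}' not found in profile.yaml")
    return profiles[name]


if __name__ == "__main__":
    profile = resolve_profile(PROJECT, PROFILES)
    print(profile["warehouse_name"], profile["schema_name"])
```

Failing early with a clear message when the named profile is missing keeps a misconfigured project from surfacing later as a confusing warehouse connection error.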
## License | ||
|
||
MIT | ||
MIT |