- Ran Meltano commands:

  ```shell
  meltano init
  meltano add --custom extractor tap-postgres
  meltano add --custom loader target-postgres
  ```
- Manually set config vars in `meltano.yml` for the tap and target.
- Ran `meltano install`.
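For reference, the tap and target entries in `meltano.yml` end up roughly like the sketch below. The setting names, `pip_url`s, and `namespace`s here are assumptions (they depend on what `meltano add --custom` is answered with); the real file is whatever the prompts generated plus the manually set values:

```yaml
# meltano.yml (sketch; names and settings are assumptions)
plugins:
  extractors:
    - name: tap-postgres
      namespace: tap_postgres
      pip_url: tap-postgres
      executable: tap-postgres
      settings:
        - name: host
        - name: port
        - name: user
        - name: password
          kind: password
        - name: dbname
  loaders:
    - name: target-postgres
      namespace: target_postgres
      pip_url: target-postgres
      executable: target-postgres
```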
- Created a `.env_template` file with default values.
- Copied `.env_template` to `.env`.
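A sketch of what `.env_template` might hold; the variable names match the connection string used later for Airflow, but the default values here are placeholders, not the project's actual defaults:

```shell
# .env_template — placeholder defaults (all values are assumptions)
PG_USERNAME=warehouse
PG_PASSWORD=warehouse
PG_ADDRESS=localhost
PG_PORT=5432
PG_DATABASE=warehouse
```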
- Ran `docker-compose up -d warehouse_db` from a separate docker-compose virtual env.
- Added `robmoore/adventureworks-for-postgres` to `docker-compose.yml`, then ran `docker-compose up -d source_db` from the docker-compose virtual env.
- Added tables for the tap:

  ```shell
  meltano invoke tap-postgres --discover
  meltano select tap-postgres --list --all
  meltano select tap-postgres "salesorderdetail" "*"
  ```
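For reference, the `source_db` service added to `docker-compose.yml` above would look something like this sketch; only the image name comes from the notes, while the host port and `POSTGRES_PASSWORD` are assumptions:

```yaml
# docker-compose.yml (fragment; port mapping and credentials are assumptions)
services:
  source_db:
    image: robmoore/adventureworks-for-postgres
    environment:
      POSTGRES_PASSWORD: postgres  # assumed default, not from the notes
    ports:
      - "5433:5432"  # assumed host port, chosen to avoid clashing with warehouse_db
```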
- Tested that the tap is working:

  ```shell
  meltano invoke tap-postgres --properties .meltano/run/tap-postgres/tap.properties.json
  ```
- Added dbt: `meltano add transformer dbt`
- Added Airflow: `meltano add orchestrator airflow`, then manually edited the `sql_alchemy_conn` value to point at Postgres with `sql_alchemy_conn: postgresql+psycopg2://$PG_USERNAME:$PG_PASSWORD@$PG_ADDRESS:$PG_PORT/$PG_DATABASE` and set `load_examples: 'False'`.
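In a Meltano project those Airflow settings typically sit under the orchestrator's `config` block in `meltano.yml`; a sketch, where the nesting under `core` is an assumption based on how Airflow groups these options:

```yaml
# meltano.yml (fragment; nesting under `core` is an assumption)
plugins:
  orchestrators:
    - name: airflow
      pip_url: apache-airflow
      config:
        core:
          sql_alchemy_conn: postgresql+psycopg2://$PG_USERNAME:$PG_PASSWORD@$PG_ADDRESS:$PG_PORT/$PG_DATABASE
          load_examples: 'False'
```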
- Initialized the Airflow DB: `meltano invoke airflow initdb`
- Started the Airflow UI: `meltano invoke airflow webserver -D`
- Started the Airflow scheduler, enabling background job processing: `meltano invoke airflow scheduler -D`
- Scheduled the job: `meltano schedule adventureworks tap-postgres target-postgres @hourly --transform run`
- Unpaused the DAG: logged into Airflow at http://localhost:8080 and clicked the 'off' toggle to the left of the DAG.
See https://www.sqldatadictionary.com/AdventureWorks2014/ for the data dictionary.
- Generated a bootstrap model using dbt-helper from the `transform` directory:

  ```shell
  dbt-helper bootstrap --profiles-dir ./profile/ --schemas raw
  ```
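The `--profiles-dir ./profile/` flag implies a `profiles.yml` in that directory; a minimal sketch assuming a Postgres target, where the profile name, schema, and all credential values are placeholders, not the project's actual settings:

```yaml
# ./profile/profiles.yml (sketch; all names and values are assumptions)
meltano:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: warehouse
      password: warehouse
      dbname: warehouse
      schema: analytics
      threads: 1
```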
See also Running Airflow on Heroku.