Python Data Engineering Challenge

Our system ingests search term data from Google Ads API into a PostgreSQL database, via an AWS S3 Data Lake.

Once ingested we score each search term with its Return On Ad Spend (ROAS).

ROAS = conversion value / cost

Three CSVs have been given - campaigns.csv, adgroups.csv and search_terms.csv.

First ingest these 3 CSVs into a database, ensure the data ingestion is idempotent.

Secondly, the adgroup alias is in the format:

Shift - Shopping - <country> - <campaign structure value> - <priority> - <random string> - <hash>

We sometimes need to know the ROAS aggregated by country and/or by priority.

Build something to allow for those aggregations to be queried easily.

Please fork this repo to complete the challenge, once done email back link to your repo.

Good luck we are rooting for you!

Provide feedback