Containerized environment and first cookbook #1

Merged: rachhouse merged 23 commits into main from env-and-ci on Oct 17, 2024

@rachhouse (Contributor) commented on Oct 3, 2024

Overview

This PR includes the first stage of the tutorial series on integrating GX into an Airflow pipeline. The key pieces are:

  • A README with instructions on how to run the tutorial locally
  • A tested, working Docker Compose setup that runs Airflow, Postgres, and JupyterLab
  • A JupyterLab notebook ("cookbook") that walks the user through integrating GX data validation in an Airflow pipeline.
  • Tests for notebook and Airflow pipeline code.

This first notebook uses a happy path to familiarize the user with the relevant pieces (before data validation fails in the pipeline in the next notebook!)
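For readers who want a feel for the happy path versus the failing path before opening the cookbook, the idea of a validation step gating a pipeline can be sketched in plain Python. This is only an illustration of the pattern: it does not use the GX or Airflow APIs, and `check_not_null`, `run_pipeline`, the `customer_id` column, and the sample rows are all hypothetical names chosen for this sketch.

```python
from typing import Callable, Iterable

Row = dict[str, object]


def check_not_null(rows: Iterable[Row], column: str) -> bool:
    """Return True when every row has a non-None value in `column`."""
    return all(row.get(column) is not None for row in rows)


def run_pipeline(rows: list[Row], load: Callable[[list[Row]], None]) -> bool:
    """Load `rows` only if validation passes; report whether the load ran."""
    if not check_not_null(rows, "customer_id"):
        # Failing path: stop before bad data reaches the load step.
        return False
    load(rows)  # Happy path: validation passed, so the load step runs.
    return True


# Hypothetical sample data: one clean batch and one batch with a null key.
loaded: list[Row] = []
happy = run_pipeline([{"customer_id": 1}, {"customer_id": 2}], loaded.extend)
unhappy = run_pipeline([{"customer_id": None}], loaded.extend)
```

In this sketch only the clean batch reaches `loaded`; the batch with a missing `customer_id` is stopped at the validation step.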

Here are direct links to the key PR files:

Review request

I'm not looking for an exhaustive review of all the PR files. Here is what I'd appreciate in your review:

  • [Does the tutorial run?] Please use the README instructions to run the tutorial locally and work through Cookbook 1. Are you able to run Cookbook 1 successfully? Do you encounter any errors or unexpected behavior?

Important

Since this work is in a branch, you'll need to switch to the PR branch after cloning the repo:

git clone https://github.com/greatexpectationslabs/tutorial-gx-in-the-data-pipeline.git
cd tutorial-gx-in-the-data-pipeline
git checkout env-and-ci

  • [Is the tutorial content understandable?] I've kept the README and notebook text relatively terse in an effort to keep it direct, simple, and more easily maintainable. Are there any sections that are difficult to understand or need some work before this goes live?

@rachhouse self-assigned this on Oct 3, 2024

@adeola-ak left a comment

Amazing work, Rachel! The tutorial was easy to follow, informative, and very concise. It covers great material that will be extremely helpful to many users! I had no problem getting started or working through it. I'm approving, but a few small points need a quick update:

1. It looks like an admonition didn't render properly here:

[Screenshot 2024-10-16 at 6 45 17 PM]

2. A few image links are broken:

[Screenshot 2024-10-16 at 6 53 19 PM]

[Screenshot 2024-10-16 at 6 53 41 PM]

[Screenshot 2024-10-16 at 6 53 54 PM]

3. Lastly, I'm not sure whether the cells at the end of the summary are meant to be there.

[Screenshot 2024-10-16 at 6 56 49 PM]
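Points 2 and 3 can also be checked without opening the notebook in a browser. The following is a minimal sketch, assuming the standard `.ipynb` JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`). It lists the image targets referenced in Markdown cells and counts empty cells at the end of the notebook. The helper names and the sample notebook content are hypothetical, and the script only reports what it finds; it does not determine why an image fails to render.

```python
import json
import re

# Matches Markdown images ![alt](target) and HTML <img src="target"> tags.
IMAGE_PATTERN = re.compile(
    r'!\[[^\]]*\]\(([^)\s]+)|<img[^>]*\bsrc=["\']([^"\']+)["\']'
)


def load_notebook(path: str) -> dict:
    """Read a .ipynb file, which is plain JSON, into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def cell_text(cell: dict) -> str:
    """Return a cell's source as one string (it may be a list of lines)."""
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def image_targets(notebook: dict) -> list[str]:
    """Collect image targets referenced in the notebook's Markdown cells."""
    targets = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        for md_target, html_target in IMAGE_PATTERN.findall(cell_text(cell)):
            targets.append(md_target or html_target)
    return targets


def trailing_empty_cells(notebook: dict) -> int:
    """Count whitespace-only cells at the end of the notebook."""
    count = 0
    for cell in reversed(notebook.get("cells", [])):
        if cell_text(cell).strip():
            break
        count += 1
    return count


# Hypothetical in-memory notebook standing in for a real .ipynb file.
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Summary\n", "![dag](img/dag.png)\n"]},
        {"cell_type": "markdown", "source": '<img src="https://example.com/ui.png" width="400">'},
        {"cell_type": "code", "source": []},
        {"cell_type": "code", "source": ["\n"]},
    ]
}

targets = image_targets(sample)
empty_tail = trailing_empty_cells(sample)
```

For a real notebook, pass the result of `load_notebook(path)` to the same two functions and inspect each reported target by hand.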

@rachhouse
Contributor Author

@adeola-ak I've done some experimenting and reading, and I think the images aren't showing up in the GitHub rendering of the notebook because the repo isn't public (other users have reported running into this particular problem). I'm going to go ahead and merge this PR and make the repo public. If I still see the image issue, I'll try again to fix it.

Thanks again for the review!

@rachhouse merged commit 2dcdc37 into main on Oct 17, 2024
2 checks passed
@rachhouse deleted the env-and-ci branch on October 17, 2024, 16:33