Add docs for configuring workflows, environment variables and defaults #338

Merged
merged 5 commits into from
Jan 7, 2025
2 changes: 2 additions & 0 deletions dev/dags/example_dag_factory.yml
@@ -3,6 +3,7 @@ default:
catchup: false,
start_date: 2024-11-11

# ----8<--- [ start: example_dag_yaml_configuration ]
basic_example_dag:
default_args:
owner: "custom_owner"
@@ -21,3 +22,4 @@ basic_example_dag:
operator: airflow.operators.bash_operator.BashOperator
bash_command: "echo 2"
dependencies: [task_1]
# ----8<--- [ end: example_dag_yaml_configuration ]
3 changes: 3 additions & 0 deletions dev/dags/example_dag_factory_multiple_config.yml
@@ -16,6 +16,8 @@ default:
on_failure_callback_name: print_hello_from_callback
on_failure_callback_file: $CONFIG_ROOT_DIR/print_hello.py


# ----8<--- [ start: environment_variable_example ]
example_dag:
default_args:
owner: "custom_owner"
@@ -36,6 +38,7 @@ example_dag:
python_callable_name: print_hello
python_callable_file: $CONFIG_ROOT_DIR/print_hello.py
dependencies: [task_1]
# ----8<--- [ end: environment_variable_example ]

example_dag2:
default_args:
21 changes: 21 additions & 0 deletions docs/features/configuring_workflows.md
@@ -0,0 +1,21 @@
# Configuring Your Workflows

DAG Factory allows you to define workflows in a structured, configuration-driven way using YAML files.

## Key Elements of Workflow Configuration

- **dag_id**: Unique identifier for your DAG.
- **default_args**: Common arguments for all tasks.
- **schedule**/**schedule_interval**: Specifies the execution schedule.
- **tasks**: Defines the [Airflow tasks](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/tasks.html) in your workflow.

### Example DAG Configuration

```title="example_dag_factory.yml"
--8<-- "dev/dags/example_dag_factory.yml:example_dag_yaml_configuration"
```
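
Once the YAML is parsed, the key elements listed above map onto a nested dictionary. The sketch below illustrates that shape; it is not DAG Factory's own loader. The task names `task_1` and `task_2`, the `schedule_interval` value, and the `echo 1` command are assumptions, because the diff shows only part of the file. `undefined_dependencies` is a hypothetical helper, not part of DAG Factory.

```python
# Hypothetical in-memory form of the YAML example. Values not visible in
# the diff (task names, task_1's command, the schedule) are assumptions.
config = {
    "basic_example_dag": {
        "default_args": {"owner": "custom_owner"},
        "schedule_interval": "0 3 * * *",  # assumed value
        "tasks": {
            "task_1": {
                "operator": "airflow.operators.bash_operator.BashOperator",
                "bash_command": "echo 1",  # assumed value
            },
            "task_2": {
                "operator": "airflow.operators.bash_operator.BashOperator",
                "bash_command": "echo 2",
                "dependencies": ["task_1"],
            },
        },
    }
}


def undefined_dependencies(dag_config):
    """Return (task, dependency) pairs whose dependency is not a task in the DAG."""
    tasks = dag_config.get("tasks", {})
    missing = []
    for task_name, task in tasks.items():
        for dependency in task.get("dependencies", []):
            if dependency not in tasks:
                missing.append((task_name, dependency))
    return missing


# The top-level key plays the role of the dag_id.
for dag_id, dag_config in config.items():
    print(dag_id, undefined_dependencies(dag_config))
```

A check like this can catch a misspelled entry in `dependencies` before the file reaches the scheduler.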

### More configuration parameters

- [Environment variables](./environment_variables.md)
- [Defaults](./defaults.md)
57 changes: 57 additions & 0 deletions docs/features/defaults.md
@@ -0,0 +1,57 @@
# Defaults

DAG Factory allows you to define Airflow
[default_args](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html#default-arguments) and
additional DAG-level arguments in a `default` block. This block enables you to share common settings across all DAGs in
your YAML configuration, with the arguments automatically applied to each DAG defined in the file.

## Benefits of using the default block

- Consistency: Ensures uniform configurations across all tasks and DAGs.
- Maintainability: Reduces duplication by centralizing common properties.
- Simplicity: Makes configurations easier to read and manage.

### Example usage of default block

```title="Usage of default block in YAML"
--8<-- "dev/dags/example_task_group.yml"
```

The arguments specified in the `default` block, such as `default_args`, `default_view`, `max_active_runs`,
`schedule_interval`, and any others defined, will be applied to all the DAGs in the YAML configuration.
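
The effect of the `default` block can be sketched in plain Python. This is an illustration under assumptions, not DAG Factory's implementation: `apply_default_block` is a hypothetical helper, the DAG names and values are invented, and the sketch assumes that `default_args` are merged key by key while other DAG-level settings are replaced as a whole.

```python
import copy


def apply_default_block(config):
    """Return {dag_id: settings} with the `default` block applied to every DAG."""
    defaults = config.get("default", {})
    dags = {}
    for dag_id, dag_settings in config.items():
        if dag_id == "default":
            continue  # the default block is not a DAG itself
        merged = copy.deepcopy(defaults)
        for key, value in dag_settings.items():
            if key == "default_args":
                # Assumption: default_args merge key by key, DAG values win.
                merged_args = dict(merged.get("default_args", {}))
                merged_args.update(value)
                merged["default_args"] = merged_args
            else:
                merged[key] = copy.deepcopy(value)
        dags[dag_id] = merged
    return dags


# Hypothetical configuration, already parsed from YAML.
example_config = {
    "default": {
        "default_args": {"owner": "default_owner", "retries": 1},
        "max_active_runs": 1,
        "schedule_interval": "0 1 * * *",
    },
    "dag_a": {"default_args": {"owner": "custom_owner"}, "tasks": {}},
    "dag_b": {"schedule_interval": "@daily", "tasks": {}},
}

resolved_dags = apply_default_block(example_config)
print(resolved_dags["dag_a"]["default_args"])
print(resolved_dags["dag_b"]["schedule_interval"])
```

In this sketch `dag_a` keeps its own `owner` but inherits `retries` and `max_active_runs`, while `dag_b` inherits everything except the schedule it sets itself.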

## Multiple ways to specify Airflow default_args

DAG Factory lets you specify Airflow's `default_args` in several ways, depending on your requirements.

1. Specifying `default_args` in the `default` block

As seen in the previous example, you can define shared `default_args` for all DAGs in the configuration YAML under
the `default` block. These arguments are automatically inherited by every DAG defined in the file.

2. Specifying `default_args` directly in a DAG configuration

You can define or override specific `default_args` at the individual DAG level. This lets you customize arguments
for one DAG without affecting the others.

Example:

```title="DAG level default_args"
--8<-- "dev/dags/example_dag_factory.yml"
```

3. Specifying `default_args` in a shared `defaults.yml`

Starting with DAG Factory 0.22.0, you can also keep the `default_args` in the `defaults.yml` file. The configuration
from `defaults.yml` is applied to all DAGs generated by DAG Factory.

```title="defaults.yml"
--8<-- "dev/dags/defaults.yml"
```

Because `default_args` can be specified in several places, the following precedence order, highest first, is applied
when the same argument is defined more than once:

1. In the DAG configuration
2. In the default block
3. In the defaults.yml
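
Reading the list above as highest precedence first, the lookup can be sketched with `collections.ChainMap` from the Python standard library. This illustrates the ordering only and is not DAG Factory's code: `resolve_default_args` is a hypothetical helper and all argument values are invented.

```python
from collections import ChainMap


def resolve_default_args(dag_level, default_block, defaults_yml):
    """Combine default_args from three sources; earlier arguments take precedence."""
    # ChainMap searches its mappings left to right, so the first hit wins.
    return dict(ChainMap(dag_level, default_block, defaults_yml))


# Hypothetical values for each source.
defaults_yml = {"owner": "global_owner", "retries": 1, "email_on_failure": False}
default_block = {"owner": "default_owner", "retries": 2}
dag_level = {"owner": "custom_owner"}

resolved = resolve_default_args(dag_level, default_block, defaults_yml)
print(resolved["owner"])             # custom_owner (from the DAG configuration)
print(resolved["retries"])           # 2 (from the default block)
print(resolved["email_on_failure"])  # False (from defaults.yml)
```

Each argument is taken from the most specific source that defines it, so a value in `defaults.yml` is used only when neither the DAG nor the `default` block sets it.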
18 changes: 18 additions & 0 deletions docs/features/environment_variables.md
@@ -0,0 +1,18 @@
# Environment variables

Starting with release `0.20.0`, DAG Factory supports referencing environment variables directly within YAML
configuration files. The variables are resolved during DAG parsing, which enables dynamic configuration paths and
makes workflows more portable.

This removes the reliance on hard-coded paths, so the same configuration can be used unchanged across different
environments.

## Example YAML Configuration with Environment Variables

```title="Reference environment variable in YAML"
--8<-- "dev/dags/example_dag_factory_multiple_config.yml:environment_variable_example"
```

In the above example, `$CONFIG_ROOT_DIR` is used to reference an environment variable that points to the root
directory of your DAG configurations. During DAG parsing, it will be resolved to the value specified for the
`CONFIG_ROOT_DIR` environment variable.
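
The substitution described above behaves much like `os.path.expandvars` from the Python standard library. The sketch below uses that function to show the effect. It assumes, without confirming from this page, that DAG Factory resolves variables in a comparable way. The directory `/usr/local/airflow/dags` and the variable name `DAG_FACTORY_UNSET_EXAMPLE` are hypothetical.

```python
import os

# Hypothetical value; in practice this is set in the Airflow environment.
os.environ["CONFIG_ROOT_DIR"] = "/usr/local/airflow/dags"

raw_value = "$CONFIG_ROOT_DIR/print_hello.py"
expanded = os.path.expandvars(raw_value)
print(expanded)

# os.path.expandvars leaves references to undefined variables unchanged,
# so an unset variable shows up literally in the resulting path.
os.environ.pop("DAG_FACTORY_UNSET_EXAMPLE", None)
unresolved = os.path.expandvars("$DAG_FACTORY_UNSET_EXAMPLE/print_hello.py")
print(unresolved)
```

If a path in a parsed configuration still contains a literal `$NAME`, checking whether that variable is set in the scheduler's environment is a reasonable first debugging step.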
16 changes: 7 additions & 9 deletions docs/index.md
@@ -7,20 +7,18 @@ Everything you need to know about how to build Apache Airflow® workflows using
Are you new to DAG Factory? This is the place to start!

* DAG Factory at a glance
* [Quickstart with Airflow standalone](getting-started/quick-start-airflow-standalone.md)
* [Quickstart with Astro CLI](getting-started/quick-start-astro-cli.md)
* [Quickstart with Airflow standalone](getting-started/quick-start-airflow-standalone.md)
* [Quickstart with Astro CLI](getting-started/quick-start-astro-cli.md)
* Install guide
* [Using YAML instead of Python](./comparison/index.md)
* [Traditional Airflow Operators](./comparison/traditional_operators.md)
* [TaskFlow API](./comparison/taskflow_api.md)
* [Traditional Airflow Operators](./comparison/traditional_operators.md)
* [TaskFlow API](./comparison/taskflow_api.md)

## Features

* Configuring your workflows
* Environment variables
* Defaults
* Defining actions upon task states
* Callbacks
* [Configuring your workflows](./features/configuring_workflows.md)
* [Environment variables](./features/environment_variables.md)
* [Defaults](./features/defaults.md)
* Dynamically creating tasks during runtime
* Dynamic task mapping
