pinecorpca/airflow-local-runner

About this repo

This repository spins up a local Apache Airflow environment using the official MWAA production image from amazon-mwaa-docker-images. That way your local stack matches production (same image, same dependencies, and the same decorator/API behavior, e.g. `outlets` support).

This repo is responsible for: building the official image (when needed) and starting all Docker containers (postgres, SQS, migratedb, webserver, scheduler, worker).

Another repo (or you) is responsible for: setting environment variables so that DAGs, plugins, requirements, and startup script come from your project, then calling this repo to start the stack.

Environment variables (contract)

Set these so the running containers use your paths instead of this repo’s defaults:

| Variable | Meaning | Default if unset |
| --- | --- | --- |
| MWAA_DAGS_PATH | Host path to DAGs | This repo’s dags/ |
| MWAA_PLUGINS_PATH | Host path to plugins | This repo’s plugins/ |
| MWAA_REQUIREMENTS_PATH | Host path to requirements dir (containing requirements.txt) | This repo’s requirements/ |
| MWAA_STARTUP_SCRIPT_PATH | Host path to startup dir (containing startup.sh) | This repo’s startup_script/ |
| MWAA_IMAGES_REPO_PATH | Path to an existing clone of amazon-mwaa-docker-images (optional) | This repo’s amazon-mwaa-docker-images/ (cloned on first build) |

Example (from another repo):

export MWAA_DAGS_PATH="/path/to/my-project/dags"
export MWAA_REQUIREMENTS_PATH="/path/to/my-project/requirements"
export MWAA_STARTUP_SCRIPT_PATH="/path/to/my-project/startup_script"
cd /path/to/airflow-local-runner
./mwaa-local-env start

Prerequisites

  • Docker and Docker Compose
  • Python 3.11+ and pip3 (for building the official image and for Fernet key generation)
  • Git (to clone amazon-mwaa-docker-images on first build)

Get started

1. Build the official MWAA image

The first time, this repo will clone amazon-mwaa-docker-images (into amazon-mwaa-docker-images/ at the repo root, ignored by git), create the Python venvs its build scripts require, and build the image. This takes several minutes.

./mwaa-local-env build-image

Resulting image: amazon-mwaa-docker-images/airflow:2.10.3.

2. Start Airflow

From this repo’s root (or from another repo after setting the env vars above and cd-ing into this repo):

./mwaa-local-env start

To stop: press Ctrl+C and wait for the containers to shut down.

3. Access the UI

  • URL: http://localhost:8081/ (port 8081 to avoid clashing with other tools)
  • Auth: MWAA__CORE__AUTH_TYPE is set to "testing", so no login is required. (You can change this via custom config if needed.)
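Once the stack is up, you can also check the webserver from a script. A minimal sketch using only the standard library, assuming the Airflow webserver’s `/health` endpoint on port 8081 (the function name is just an illustration):

```python
import json
import urllib.request


def airflow_healthy(base_url: str = "http://localhost:8081") -> bool:
    """Return True if the webserver's /health endpoint reports every
    component (metadatabase, scheduler, ...) as healthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            health = json.load(resp)
    except OSError:
        return False  # webserver not reachable (yet)
    return all(part.get("status") == "healthy" for part in health.values())


if __name__ == "__main__":
    print("webserver healthy:", airflow_healthy())
```

This is handy in a wait-loop before triggering DAGs from integration tests.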

4. DAGs, requirements, plugins, startup

  • DAGs: Add DAGs in the directory you set as MWAA_DAGS_PATH (or in this repo’s dags/ if you don’t set it).

  • Requirements: Put a requirements.txt in the directory you set as MWAA_REQUIREMENTS_PATH. The image installs it at startup. Test without starting the full stack:

    ./mwaa-local-env test-requirements
  • Plugins: Use the directory set as MWAA_PLUGINS_PATH.

  • Startup script: Put startup.sh in the directory set as MWAA_STARTUP_SCRIPT_PATH. The official MWAA image expects the script to create /tmp/customer_env_vars.json (a JSON object of env vars to export to tasks; use {} for none). This repo’s default startup_script/startup.sh does that. Test your script:

    ./mwaa-local-env test-startup-script
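A minimal startup.sh along these lines satisfies the contract above (the `MY_PROJECT_ENV` variable is purely illustrative):

```shell
#!/bin/bash
# Example startup.sh: runs once per container at startup.
set -euo pipefail

# Any extra setup (OS packages, config tweaks) goes here.

# The official MWAA image expects this file to exist: a JSON object of
# environment variables to export to tasks (use {} for none).
cat > /tmp/customer_env_vars.json <<'EOF'
{"MY_PROJECT_ENV": "local"}
EOF
```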

Commands

| Command | Description |
| --- | --- |
| ./mwaa-local-env build-image | Clone (if needed) and build the official MWAA image from amazon-mwaa-docker-images |
| ./mwaa-local-env start | Start the full stack (postgres, SQS, migratedb, webserver, scheduler, worker) |
| ./mwaa-local-env reset-db | Reset the Airflow DB (run if you hit “dag_stats_table already exists” or similar) |
| ./mwaa-local-env test-requirements | Run the requirements install in an ephemeral container (uses current MWAA_*_PATH env) |
| ./mwaa-local-env test-startup-script | Run the startup script in an ephemeral container |
| ./mwaa-local-env validate-prereqs | Check Docker, Docker Compose, Python3, pip3 |
| ./mwaa-local-env help | Show help and an env var summary |

Note: package-requirements is not supported when using the official image; use test-requirements and your own pip download/wheel workflow if needed.

What this repo contains

  • mwaa-local-env – CLI script (build, start, reset-db, test-requirements, test-startup-script).
  • docker/docker-compose-official.yml – Full stack using the official image and env-based volume mounts.
  • docker/docker-compose-official-resetdb.yml – Reset DB using the same image and volumes.
  • docker/config/.env.localrunner – Optional env file for AWS credentials and overrides (e.g. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY).
  • dags/, plugins/, requirements/, startup_script/ – Default content; overridden when you set MWAA_*_PATH.

The directory amazon-mwaa-docker-images/ is created on first build and is gitignored.

Troubleshooting

Environment not starting

  • If you see errors like “dag_stats_table already exists”, run:

    ./mwaa-local-env reset-db

    Then start again. You can also remove ./db-data if you want a completely fresh DB.

Fernet key issues

A Fernet key is generated and cached (e.g. in ~/.cache/mwaa-local-runner/fernet.key.json) the first time you run start or reset-db. If you clear that cache or change how the key is produced, the DB may no longer decrypt correctly; in that case run ./mwaa-local-env reset-db (and optionally delete ./db-data).
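For reference, a Fernet key is just 32 random bytes, URL-safe base64-encoded, so you can generate a replacement with the standard library alone (the cache file location above is this repo’s convention, not Airflow’s):

```python
import base64
import os


def generate_fernet_key() -> str:
    """Generate a Fernet-compatible key: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    # A 44-character key, usable as AIRFLOW__CORE__FERNET_KEY.
    print(generate_fernet_key())
```

Remember that swapping in a new key makes existing encrypted connections/variables in the DB undecryptable, hence the reset-db advice above.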

Using your own clone of amazon-mwaa-docker-images

If you already have a clone of amazon-mwaa-docker-images, set:

export MWAA_IMAGES_REPO_PATH=/path/to/your/amazon-mwaa-docker-images
./mwaa-local-env build-image

Build will use that path instead of cloning into this repo.

AWS credentials

To test DAGs that use AWS (e.g. operators), set credentials in docker/config/.env.localrunner (e.g. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN). They are loaded by the compose stack.
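For example (all values are placeholders; the region variable is an optional addition, and real credentials should never be committed):

```shell
# docker/config/.env.localrunner — loaded by the compose stack
AWS_ACCESS_KEY_ID=AKIAEXAMPLE
AWS_SECRET_ACCESS_KEY=exampleSecretKey
AWS_SESSION_TOKEN=exampleSessionToken
AWS_DEFAULT_REGION=us-east-1
```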

@dag(outlets=...) compatibility

Upstream Apache Airflow 2.10.3 does not support outlets on the @dag decorator; AWS MWAA’s 2.10.3 image may use a patched build that does. This repo includes a plugin (plugins/dag_outlets_compat.py) that patches the decorator so DAGs that use @dag(outlets=[...], ...) parse and run locally: it sets dag.outlets on the DAG object so downstream dataset-triggered DAGs still work. When you run from the other repo without setting MWAA_PLUGINS_PATH, this repo’s plugins/ (including the compat) is mounted, so the patch is active. If you set MWAA_PLUGINS_PATH to another folder, include a copy of dag_outlets_compat.py there so the patch still loads.
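The core of such a compat plugin is a decorator wrapper that strips the unsupported kwarg and re-attaches it afterwards. A simplified, self-contained sketch of the pattern (the real plugin patches `airflow.decorators.dag` in place, which is not shown here, and its internals may differ):

```python
import functools


def add_outlets_support(original_dag):
    """Wrap a @dag-style decorator factory so it tolerates an `outlets` kwarg.

    `original_dag(**kwargs)` is assumed to return a decorator that turns a
    function into a DAG object; we strip `outlets` before that call and set
    it on the resulting DAG object afterwards.
    """
    @functools.wraps(original_dag)
    def patched(*args, outlets=None, **kwargs):
        decorator = original_dag(*args, **kwargs)

        def wrapper(fn):
            dag_obj = decorator(fn)
            if outlets is not None:
                dag_obj.outlets = outlets  # dataset-triggered downstreams see this
            return dag_obj

        return wrapper

    return patched
```

With the patch applied, `@dag(outlets=[...], ...)` parses on an Airflow build whose `@dag` does not accept `outlets` natively.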

License

This library is licensed under the MIT-0 License. See the LICENSE file.

About

This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.
