
Airflow Variables & Connections

So far, you’ve created DAGs, set task dependencies, and scheduled your workflows. ✅

But what if your DAG needs credentials, file paths, or API keys?

Hardcoding them is risky and inflexible. Airflow solves this problem with:

  1. Variables – dynamic parameters
  2. Connections – secure credentials

Real-World Story: Configuration as a Service​

Imagine running multiple environments:

  • Dev, Test, Production
  • Same DAG code, different database credentials
  • Hardcoding these would be messy

Airflow lets you store configuration centrally and reuse it safely.


1️⃣ Airflow Variables​

What are Variables?​

Variables are key-value pairs stored in Airflow's metadata database.
Use them for:

  • File paths
  • API endpoints
  • Threshold values
  • Feature toggles
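Values fetched with Variable.get come back as strings unless JSON deserialization is requested, so thresholds and feature toggles usually need an explicit conversion. The following is a minimal sketch using only the standard library. The raw values, the sample measurement, and the helper name parse_flag are hypothetical stand-ins for what a lookup might return.

```python
# Hypothetical raw strings, standing in for values returned by Variable.get(...)
raw_threshold = "0.75"
raw_toggle = "True"


def parse_flag(raw: str) -> bool:
    """Treat common truthy spellings as True; anything else counts as False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


threshold = float(raw_threshold)          # numeric threshold for comparisons
feature_enabled = parse_flag(raw_toggle)  # boolean feature toggle

measurement = 0.9  # hypothetical value produced by a task
if feature_enabled and measurement > threshold:
    print("feature enabled and measurement above threshold")
```

Keeping the conversion in one small helper means every task interprets a toggle the same way.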

Example: Create a Variable in UI​

Key         Value
s3_bucket   my-data-bucket
api_key     12345-ABCDE

Accessing Variables in DAG​

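from airflow.models import Variable

# No default given: the lookup fails with an error if "s3_bucket" is not defined
s3_bucket = Variable.get("s3_bucket")
# Falls back to "default-key" when "api_key" is not defined
api_key = Variable.get("api_key", default_var="default-key")

print(s3_bucket)  # Output: my-data-bucket
print(api_key)  # Output: 12345-ABCDE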

Input & Output Example​

Input​

  • Variable key: s3_bucket
  • Stored value: my-data-bucket

Output​

my-data-bucket

✅ Variables can also store JSON:

config = Variable.get("my_config", deserialize_json=True)
print(config["threshold"])
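With deserialize_json=True, the stored string is parsed as JSON before it is returned, which is roughly what json.loads does to the raw value. Here is a standard-library sketch of that step, assuming a hypothetical value for my_config (the article does not show its contents):

```python
import json

# Hypothetical raw value saved under the key "my_config" (an assumption, not from the article)
raw_value = '{"threshold": 0.8, "paths": {"input": "/data/in"}}'

# Roughly what deserialize_json=True does to the stored string
config = json.loads(raw_value)

threshold = config["threshold"]
# Use .get with fallbacks so a missing key does not crash the task
input_path = config.get("paths", {}).get("input", "/tmp")

print(threshold, input_path)
```

Grouping related settings in one JSON Variable keeps them together and cuts down on separate lookups.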

2️⃣ Airflow Connections​

What are Connections?​

Connections store credentials and endpoints for:

  • Databases (Postgres, MySQL, Redshift)
  • Cloud providers (AWS, GCP, Azure)
  • APIs (HTTP, FTP)

This avoids hardcoding sensitive information in DAGs.


Example: Define a Connection in UI​

Connection ID   Type       Host        Login   Password
my_postgres     Postgres   localhost   user    pass123

Accessing a Connection in DAG​

from airflow.hooks.base import BaseHook

conn = BaseHook.get_connection("my_postgres")
print(conn.host) # Output: localhost
print(conn.login) # Output: user
print(conn.password) # Output: pass123

Input & Output Example​

Input​

  • Connection ID: my_postgres
  • Host: localhost
  • Login: user
  • Password: pass123

Output​

localhost
user
pass123
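Outside the UI, a Connection is often written as a single URI, for example when it is supplied through an environment variable. The sketch below uses Python's urllib.parse to show how the fields from the example map onto such a URI. It does not call Airflow itself, and the URI shown is an assumed form of the my_postgres example. Real URIs may also carry a port, a schema, and extras, and special characters need URL-encoding.

```python
from urllib.parse import urlsplit

# Assumed URI form of the my_postgres example (type, login, password, host)
uri = "postgres://user:pass123@localhost"

parts = urlsplit(uri)
conn_type = parts.scheme    # connection type
host = parts.hostname       # database host
login = parts.username      # login name
password = parts.password   # secret; avoid writing it to logs

print(conn_type, host, login)  # the password is deliberately left out
```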

Using Variables and Connections in Tasks​

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.models import Variable
from airflow.hooks.base import BaseHook
from datetime import datetime

def print_config():
    # Look up configuration at run time, inside the task callable
    bucket = Variable.get("s3_bucket")
    conn = BaseHook.get_connection("my_postgres")
    print(f"S3 Bucket: {bucket}, DB Host: {conn.host}")

with DAG(
    dag_id="variables_connections_dag",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # newer Airflow releases name this argument `schedule`
    catchup=False,
    tags=["config", "airflow"],
) as dag:

    task = PythonOperator(
        task_id="print_config_task",
        python_callable=print_config,
    )

✅ Output in logs:

S3 Bucket: my-data-bucket, DB Host: localhost

Best Practices (Professional)​

✅ Store secrets in Connections, not Variables
✅ Use JSON Variables for structured configs
✅ Avoid hardcoding credentials in DAGs
✅ Supply secrets through environment variables (AIRFLOW_VAR_<KEY>, AIRFLOW_CONN_<CONN_ID>) or a secrets backend rather than committing them to code
✅ Name Variables and Connections consistently
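One way to act on the environment-variable advice: Airflow looks for Variables and Connections in environment variables named AIRFLOW_VAR_<KEY> and AIRFLOW_CONN_<CONN_ID>, with the name in upper case. The sketch below only builds those names and sets them for the current process. The helper functions are hypothetical, and in a real deployment the values would come from the environment (container settings, a secrets manager) rather than from Python code.

```python
import os


def variable_env_name(key: str) -> str:
    """Environment variable name for an Airflow Variable key (hypothetical helper)."""
    return f"AIRFLOW_VAR_{key.upper()}"


def connection_env_name(conn_id: str) -> str:
    """Environment variable name for an Airflow Connection ID (hypothetical helper)."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


# Sample values reuse the article's examples; real secrets should not live in code.
os.environ[variable_env_name("s3_bucket")] = "my-data-bucket"
os.environ[connection_env_name("my_postgres")] = "postgres://user:pass123@localhost"

print(variable_env_name("s3_bucket"))
print(connection_env_name("my_postgres"))
```

Because the names follow a fixed pattern, the same DAG code can run in Dev, Test, and Production while each environment supplies its own values.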


Common Mistakes​

❌ Hardcoding passwords in DAGs
❌ Forgetting to handle missing variables (default_var)
❌ Using Variables for sensitive credentials
❌ Changing Connection IDs without updating DAGs


Key Takeaways

  • Variables store dynamic parameters
  • Connections store secure credentials
  • Access them in Python tasks via Variable.get and BaseHook.get_connection
  • Proper use improves security and maintainability

Summary​

In this chapter, you learned:

  • Difference between Variables and Connections
  • How to create, retrieve, and use Variables
  • How to create, retrieve, and use Connections
  • Best practices for secure and maintainable configuration

🎯 Your DAGs are now configurable, secure, and production-ready.


What’s Next?​

👉 Templating & Jinja Expressions in Airflow
Learn how to make DAGs dynamic using:

  • Jinja templating
  • Macros
  • Runtime parameters