|
| 1 | +--- |
| 2 | +title: Apache Airflow Integration Guide |
| 3 | +description: Collect and monitor Apache Airflow logs and metrics with OpenTelemetry Collector and visualize them in OpenObserve. |
| 4 | +--- |
| 5 | + |
| 6 | +# Integration with Apache Airflow |
| 7 | + |
| 8 | +This guide explains how to monitor **Apache Airflow** using the [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) (`otelcol`) and export logs, metrics, and traces to **OpenObserve** for visualization. |
| 9 | + |
| 10 | +## Overview |
| 11 | + |
| 12 | +Apache Airflow is a **workflow automation and orchestration tool** widely used for ETL pipelines, ML workflows, and data engineering tasks. Monitoring Airflow is critical for ensuring workflow reliability, debugging issues, and tracking system performance. |
| 13 | +<br> |
| 14 | + |
| 15 | +With OpenTelemetry and OpenObserve, you gain **real-time observability** into Airflow DAG runs, task execution, scheduler activity, and worker performance. |
| 16 | + |
| 17 | + |
| 18 | + |
| 19 | +## Steps to Integrate |
| 20 | + |
| 21 | +??? "Prerequisites" |
| 22 | + - OpenObserve account ([Cloud](https://cloud.openobserve.ai/web/) or [Self-Hosted](../../getting-started.md)) |
| 23 | + - Apache Airflow installed and running |
| 24 | + - Basic understanding of Airflow configs (`airflow.cfg`) |
| 25 | + - OpenTelemetry Collector installed |
| 26 | + |
| 27 | +??? "Step 1: Configure Airflow for OpenTelemetry" |
| 28 | + |
| 29 | + Edit `airflow.cfg` to enable OTel metrics: |
| 30 | + |
| 31 | + ```ini |
| 32 | + [metrics] |
| 33 | + otel_on = True |
| 34 | + otel_host = localhost |
| 35 | + otel_port = 4318 |
| 36 | + ``` |
| 37 | + |
| 38 | + After updating the config, apply any pending database migrations and start the scheduler and webserver again so the change takes effect: |
| 39 | + |
| 40 | + ```bash |
| 41 | + airflow db migrate |
| 42 | + airflow scheduler -D |
| 43 | + airflow webserver -D |
| 44 | + ``` |
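| | + |
| | + As a sketch of an alternative, the same options can be set through environment variables, which Airflow reads using its `AIRFLOW__{SECTION}__{KEY}` naming convention. This assumes a deployment (for example, containers) where editing `airflow.cfg` is inconvenient; the values simply mirror the ones above. |
| | + |
| | + ```bash |
| | + # Equivalent of the [metrics] section above, expressed as environment variables. |
| | + export AIRFLOW__METRICS__OTEL_ON=True |
| | + export AIRFLOW__METRICS__OTEL_HOST=localhost |
| | + export AIRFLOW__METRICS__OTEL_PORT=4318 |
| | + ``` |
| | + |
| | + These variables need to be present in the environment of every Airflow process (scheduler, webserver, workers) before it starts. |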
| 45 | + |
| 46 | +??? "Step 2: Install OpenTelemetry Collector" |
| 47 | + |
| 48 | + 1. Download and install the OTel Collector: |
| 49 | + ```bash |
| 50 | + wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/latest/download/otelcol-linux-amd64 |
| 51 | + chmod +x otelcol-linux-amd64 |
| 52 | + sudo mv otelcol-linux-amd64 /usr/local/bin/otelcol |
| 53 | + ``` |
| 54 | + |
| 55 | + 2. Verify installation: |
| 56 | + ```bash |
| 57 | + otelcol --version |
| 58 | + ``` |
| 59 | + |
| 60 | +??? "Step 3: Get OpenObserve Endpoint and Token" |
| 61 | + |
| 62 | + 1. In OpenObserve: go to **Data Sources → Otel Collector** |
| 63 | + 2. Copy the **Ingestion URL** and **Access Token** |
| 64 | +  |
| 65 | + |
| 66 | +??? "Step 4: Configure OpenTelemetry Collector" |
| 67 | + |
| 68 | + 1. Create/edit config file: |
| 69 | + ```bash |
| 70 | + sudo vi /etc/otel-config.yaml |
| 71 | + ``` |
| 72 | + |
| 73 | + 2. Add Airflow configuration: |
| 74 | + ```yaml |
| 75 | + receivers: |
| 76 | +   filelog/std: |
| 77 | +     include: |
| 78 | +       - /airflow/logs/*/*.log |
| 79 | +       - /airflow/logs/scheduler/*/*/*/*.log |
| 80 | +     start_at: beginning |
| 81 | +   otlp: |
| 82 | +     protocols: |
| 83 | +       grpc: |
| 84 | +       http: |
| 85 | + |
| 86 | + processors: |
| 87 | +   batch: |
| 88 | + |
| 89 | + exporters: |
| 90 | +   otlphttp/openobserve: |
| 91 | +     endpoint: OPENOBSERVE_ENDPOINT |
| 92 | +     headers: |
| 93 | +       Authorization: "OPENOBSERVE_TOKEN" |
| 94 | +       stream-name: airflow |
| 95 | + |
| 96 | + service: |
| 97 | +   pipelines: |
| 98 | +     metrics: |
| 99 | +       receivers: [otlp] |
| 100 | +       processors: [batch] |
| 101 | +       exporters: [otlphttp/openobserve] |
| 102 | +     logs: |
| 103 | +       receivers: [filelog/std, otlp] |
| 104 | +       processors: [batch] |
| 105 | +       exporters: [otlphttp/openobserve] |
| 106 | +     traces: |
| 107 | +       receivers: [otlp] |
| 108 | +       processors: [batch] |
| 109 | +       exporters: [otlphttp/openobserve] |
| 110 | + ``` |
| 111 | + |
| 112 | + Replace placeholders with your OpenObserve details: |
| 113 | + |
| 114 | + - `OPENOBSERVE_ENDPOINT` → API endpoint (e.g., `https://api.openobserve.ai`) |
| 115 | + - `OPENOBSERVE_TOKEN` → Access token |
| 116 | + |
| 117 | +??? "Step 5: Start OpenTelemetry Collector" |
| 118 | + |
| 119 | + ```bash |
| 120 | + sudo systemctl start otel-collector |
| 121 | + sudo systemctl status otel-collector |
| 122 | + journalctl -u otel-collector -f |
| 123 | + ``` |
| 124 | + |
| 125 | + > Check logs to confirm data is being sent to OpenObserve. |
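| | + |
| | + The commands above assume a systemd unit named `otel-collector` already exists, while Step 2 only installs the binary. A minimal sketch of creating such a unit follows, assuming the binary path (`/usr/local/bin/otelcol`) and config path (`/etc/otel-config.yaml`) from the earlier steps; the unit contents are illustrative and are not taken from the official Collector packages. |
| | + |
| | + ```bash |
| | + # Write an illustrative unit file for the Collector (paths taken from Steps 2 and 4). |
| | + sudo tee /etc/systemd/system/otel-collector.service > /dev/null <<'EOF' |
| | + [Unit] |
| | + Description=OpenTelemetry Collector (illustrative unit) |
| | + After=network-online.target |
| | + |
| | + [Service] |
| | + ExecStart=/usr/local/bin/otelcol --config /etc/otel-config.yaml |
| | + Restart=on-failure |
| | + |
| | + [Install] |
| | + WantedBy=multi-user.target |
| | + EOF |
| | + |
| | + # Make systemd pick up the new unit before starting it. |
| | + sudo systemctl daemon-reload |
| | + ``` |
| | + |
| | + If the Collector was installed from a distribution package instead, the service may already exist under a different name, in which case that name should be used in the commands above. |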
| 126 | + |
| 127 | +??? "Step 6: Visualize Logs in OpenObserve" |
| 128 | + |
| 129 | + 1. Go to **Streams → airflow** in OpenObserve to query logs. The collected Airflow logs include DAG execution logs, scheduler logs, worker logs, and task execution logs. |
| 130 | + |
| 131 | +  |
| 132 | + |
| 133 | + |
| 134 | +!!! tip "Prebuilt Dashboards" |
| 135 | + |
| 136 | + <br> |
| 137 | + [Prebuilt Airflow dashboards](https://github.com/openobserve/dashboards/tree/main/Airflow) are available. You can download the JSON file and import it. |
| 138 | + |
| 139 | +## Troubleshooting |
| 140 | + |
| 141 | +- **No Logs in OpenObserve** |
| 142 | + |
| 143 | + - Ensure `filelog` receiver paths match your Airflow log directory. |
| 144 | + - Verify Collector service is running. |
| 145 | + |
| 146 | +- **Metrics Not Visible** |
| 147 | + |
| 148 | + - Check `otel_on = True` in `airflow.cfg`. |
| 149 | + - Confirm Airflow is sending metrics to `localhost:4318`. |
| 150 | + |
| 151 | +- **Collector Fails to Start** |
| 152 | + |
| 153 | + - Validate the configuration: |
| 154 | + ```bash |
| 155 | + otelcol validate --config /etc/otel-config.yaml |
| 156 | + ``` |
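| 157 | + - Fix any syntax errors or unknown components it reports. The `filelog` receiver ships with the Collector contrib distribution (`otelcol-contrib`), so a core-only `otelcol` build will reject it. |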
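| | + |
| | +The first two checks in this list can be sketched from a shell on the Collector host. This assumes the log paths and the OTLP HTTP port (`4318`) used earlier in this guide, and that `curl` is installed; `/v1/metrics` is the default OTLP/HTTP metrics path. |
| | + |
| | +```bash |
| | +# List a few files matched by the filelog globs; no output means the paths need adjusting. |
| | +ls /airflow/logs/*/*.log /airflow/logs/scheduler/*/*/*/*.log 2>/dev/null | head |
| | + |
| | +# Send an empty OTLP/HTTP metrics payload and print only the HTTP status code. |
| | +curl -sS -o /dev/null -w '%{http_code}\n' -H 'Content-Type: application/json' -d '{"resourceMetrics":[]}' http://localhost:4318/v1/metrics |
| | +``` |
| | + |
| | +A 2xx status suggests the Collector's OTLP receiver is listening on that port. It does not by itself prove that Airflow is emitting metrics, so the Airflow settings from Step 1 still need to be checked. |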
| 158 | + |