Commit 162c09e

Add azure adls to troubleshooting guide
1 parent 5abad5b commit 162c09e

1 file changed

Lines changed: 48 additions & 0 deletions

docs/modules/airflow/pages/troubleshooting/index.adoc

@@ -1,5 +1,53 @@
= Troubleshooting

== Azure Blob Storage Logging

Azure's `ADLS` can be used to store Airflow task logs.

A regular storage container in Azure's ADLS backend can be accessed with either the `adls[s]` or the `wasb` connector, using the https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/connections/adls_v2.html[Azure Data Lake Storage Gen2 Connection] or the https://airflow.apache.org/docs/apache-airflow-providers-microsoft-azure/stable/connections/wasb.html[Microsoft Azure Blob Storage Connection] respectively.

Using `ADLS` as a task log backend requires accessing it via `wasb`, so the configuration in the environment should look like this:
[source,yaml]
----
webservers:
  envOverrides: &logging_overrides
    AIRFLOW__AZURE_REMOTE_LOGGING__REMOTE_WASB_LOG_CONTAINER: "<container-name>" #<1>
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER: "wasb-<folder-name>" #<2>
    AIRFLOW__LOGGING__REMOTE_LOGGING: "True"
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID: "<connection-name>" #<3>
triggerers:
  envOverrides: *logging_overrides
kubernetesExecutors:
  envOverrides: *logging_overrides
schedulers:
  envOverrides: *logging_overrides
----
<1> This environment variable is only used for `wasb` connections.
<2> Note that the `<container-name>` is *not* referenced here.
<3> This connection can be defined in the Airflow UI or constructed as an environment variable.

Due to an open https://github.com/apache/airflow/issues/58946[issue] in Airflow, it is recommended to use `wasb-<folder-name>` rather than `wasb://<folder-name>`, because the latter causes the backend to look like this:
[source,text]
----
<container-name>
└── wasb:/
    └── tasklogs/
        └── dag_id=...
----
The workaround instead results in:
[source,text]
----
<container-name>
└── wasb-tasklogs/
    └── dag_id=...
----
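
The two layouts above follow from the value of `AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER`. A small pre-deployment check could guard against the `wasb://` form. This is only a sketch: the function name `check_wasb_log_folder` is hypothetical and not part of Airflow, and it assumes that the value has to start with `wasb` for Airflow to select the Azure Blob Storage log handler.

```python
def check_wasb_log_folder(value: str) -> str:
    """Validate a remote base log folder value for the wasb workaround.

    Hypothetical helper, not an Airflow API. It assumes the value must
    start with "wasb" so that the Azure Blob Storage handler is chosen.
    """
    if not value.startswith("wasb"):
        raise ValueError("value must start with 'wasb'")
    if "://" in value:
        # The URL form ends up as a literal "wasb:/" folder in the container.
        raise ValueError("use 'wasb-<folder-name>' instead of 'wasb://<folder-name>'")
    return value


print(check_wasb_log_folder("wasb-tasklogs"))
```

Running such a check before applying the configuration would reject `wasb://tasklogs` and accept `wasb-tasklogs`.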

The `Azure Blob Storage Connection` offers the optional field `Host`, which should have a value like this:
[source,text]
----
https://<storage-account-name>.blob.core.windows.net
----
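
The connection referenced by `AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID` can also be supplied as an environment variable instead of through the Airflow UI. The following is a minimal sketch of how such a variable could be assembled, assuming an Airflow version that accepts JSON-formatted `AIRFLOW_CONN_<CONN_ID>` variables. The helper `wasb_connection_env`, the connection name `azure_logs`, and the storage account `mystorageaccount` are hypothetical placeholders. Credentials are deliberately left out, because the required fields depend on the authentication method in use.

```python
import json


def wasb_connection_env(conn_id: str, storage_account: str) -> tuple[str, str]:
    """Return (env var name, JSON value) for a wasb connection.

    Hypothetical helper: credentials are omitted on purpose and have to be
    added according to the chosen authentication method.
    """
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    value = json.dumps(
        {
            "conn_type": "wasb",
            # Optional Host field: the blob endpoint of the storage account.
            "host": f"https://{storage_account}.blob.core.windows.net",
        }
    )
    return name, value


env_name, env_value = wasb_connection_env("azure_logs", "mystorageaccount")
print(f"{env_name}={env_value}")
```

The resulting name and value could then be added to the same `envOverrides` mapping that carries the logging settings, with `<connection-name>` set to the connection id used here.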

== S3 Logging: An error occurred (411) when calling the PutObject operation: Length Required

If Airflow is trying to access S3 (e.g. for remote task logging) and throws the following error
