KubernetesPodOperator never stops if credentials are refreshed #42361

paolo-moriello · 2024-09-20T09:14:06Z

This is a follow-up to PR #39325.

That PR introduced a tenacity retry mechanism while waiting for pod completion. For long-running tasks, k8s credentials can expire while the task is still running, so we refresh the credentials and retry. However, this did not completely solve the issue because of the stop_after_attempt(3): the job still fails when credentials expire more than twice.

This PR attempts to fix this issue by:

removing the stop_after_attempt logic
still failing the job if credentials are invalid after a refresh; we still want to make sure the job doesn't run forever if it keeps producing 401s after the refresh
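
The two points above can be illustrated with a minimal sketch. This is not the code in pod.py: it replaces the tenacity decorator with a plain loop so that it runs with the standard library only, and the names `Unauthorized`, `await_pod_completion`, `read_pod`, and `refresh_credentials` are hypothetical stand-ins. A real implementation would also sleep between polls.

```python
class Unauthorized(Exception):
    """Hypothetical stand-in for a Kubernetes API error with status 401."""


def await_pod_completion(read_pod, refresh_credentials):
    """Poll until the pod reaches a terminal phase, refreshing credentials on 401.

    There is no cap on the total number of refreshes during the task.
    The call fails only when a 401 comes back right after a refresh,
    which suggests the credentials are invalid rather than merely expired.
    """
    just_refreshed = False
    while True:
        try:
            phase = read_pod()
        except Unauthorized:
            if just_refreshed:
                # The refresh did not help: give up instead of looping forever.
                raise
            refresh_credentials()
            just_refreshed = True
            continue
        # A successful read shows the current credentials are working.
        just_refreshed = False
        if phase in ("Succeeded", "Failed"):
            return phase
```

Under these assumptions, a task whose credentials expire four or more times still completes, while a task that gets a 401 immediately after a refresh fails on that second 401.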

airflow/providers/cncf/kubernetes/operators/pod.py

…e#42361) * Never stop retrying 401s in k8s pod operator * Try reading pod after refreshing credentials * Never stop retrying until credentials are refreshed * Linting --------- Co-authored-by: pmoriello <[email protected]>

paolo-moriello requested review from jedcunningham and hussein-awala as code owners September 20, 2024 09:14

boring-cyborg bot added area:providers provider:cncf-kubernetes Kubernetes provider related issues labels Sep 20, 2024

eladkal requested a review from romsharon98 September 20, 2024 17:59

romsharon98 reviewed Sep 21, 2024

airflow/providers/cncf/kubernetes/operators/pod.py (outdated, resolved)

paolo-moriello force-pushed the cncf-k8s-pod-operator-401 branch from f28c328 to 0c0787c Compare September 23, 2024 05:44

romsharon98 approved these changes Sep 23, 2024

paolo-moriello force-pushed the cncf-k8s-pod-operator-401 branch from 0c0787c to 389a59a Compare September 23, 2024 12:56

pmoriello added 4 commits September 26, 2024 14:21

Never stop retrying 401s in k8s pod operator

ecb4dbd

Try reading pod after refreshing credentials

1df6d70

Never stop retrying until credentials are refreshed

6cc1484

Linting

815d765

paolo-moriello force-pushed the cncf-k8s-pod-operator-401 branch from 389a59a to 815d765 Compare September 26, 2024 12:21

romsharon98 merged commit 7782050 into apache:main Sep 27, 2024
66 checks passed

paolo-moriello deleted the cncf-k8s-pod-operator-401 branch September 30, 2024 10:55

eladkal mentioned this pull request Oct 10, 2024

Status of testing Providers that were prepared on October 10, 2024 #42882

Closed

96 tasks
