Releases: apache/airflow

Apache Airflow 2.10.3

05 Nov 14:15

Significant Changes

No significant changes.

Bug Fixes

  • Improves the handling of value masking when setting Airflow variables for enhanced security. (#43123) (#43278)
  • Adds support for task_instance_mutation_hook to handle mapped operators with index 0. (#42661) (#43089)
  • Fixes executor cleanup to properly handle zombie tasks when task instances are terminated. (#43065)
  • Adds retry logic for HTTP 502 and 504 errors in internal API calls to handle webserver startup issues. (#42994) (#43044)
  • Restores the use of separate sessions for writing and deleting RTIF data to prevent StaleDataError. (#42928) (#43012)
  • Fixes PythonOperator error by replacing hyphens with underscores in DAG names. (#42993)
  • Improving validation of task retries to handle None values (#42532) (#42915)
  • Fixes error handling in dataset managers when resolving dataset aliases into new datasets (#42733)
  • Enables clicking on task names in the DAG Graph View to correctly select the corresponding task. (#38782) (#42697)
  • Prevent redirect loop on /home with tags/last run filters (#42607) (#42609) (#42628)
  • Support of host.name in OTEL metrics and usage of OTEL_RESOURCE_ATTRIBUTES in metrics (#42428) (#42604)
  • Reduce eyestrain in dark mode with reduced contrast and saturation (#42567) (#42583)
  • Handle ENTER key correctly in trigger form and allow manual JSON (#42525) (#42535)
  • Ensure DAG trigger form submits with updated parameters upon keyboard submit (#42487) (#42499)
  • Do not attempt to provide not stringified objects to UI via xcom if pickling is active (#42388) (#42486)
  • Fix the span link of task instance to point to the correct span in the scheduler_job_loop (#42430) (#42480)
  • Bugfix task execution from runner in Windows (#42426) (#42478)
  • Allows overriding the hardcoded OTEL_SERVICE_NAME with an environment variable (#42242) (#42441)
  • Improves trigger performance by using selectinload instead of joinedload (#40487) (#42351)
  • Suppress warnings when masking sensitive configs (#43335) (#43337)
  • Masking configuration values irrelevant to DAG author (#43040) (#43336)
  • Execute templated bash script as file in BashOperator (#43191)
  • Fixes schedule_downstream_tasks to include upstream tasks for one_success trigger rule (#42582) (#43299)
  • Add retry logic in the scheduler for updating trigger timeouts in case of deadlocks. (#41429) (#42651)
  • Mark all tasks as skipped when failing a dag_run manually (#43572)
  • Fix TrySelector for Mapped Tasks in Logs and Details Grid Panel (#43566)
  • Conditionally add OTEL events when processing executor events (#43558) (#43567)
  • Fix broken stat scheduler_loop_duration (#42886) (#43544)
  • Ensure total_entries in /api/v1/dags (#43377) (#43429)
  • Include limit and offset in request body schema for List task instances (batch) endpoint (#43479)
  • Don't raise a warning in ExecutorSafeguard when execute is called from an extended operator (#42849) (#43577)

Miscellaneous

  • Deprecate session auth backend (#42911)
  • Removed unicodecsv dependency for providers with Airflow version 2.8.0 and above (#42765) (#42970)
  • Remove the referrer from Webserver to Scarf (#42901) (#42942)
  • Bump dompurify from 2.2.9 to 2.5.6 in /airflow/www (#42263) (#42270)
  • Correct docstring format in _get_template_context (#42244) (#42272)
  • Backport: Bump Flask-AppBuilder to 4.5.2 (#43309) (#43318)
  • Check python version that was used to install pre-commit venvs (#43282) (#43310)
  • Resolve warning in Dataset Alias migration (#43425)

Doc Only Changes

  • Clarifying PLUGINS_FOLDER permissions by DAG authors (#43022) (#43029)
  • Add templating info to TaskFlow tutorial (#42992)
  • Airflow local settings no longer importable from dags folder (#42231) (#42603)
  • Fix documentation for cpu and memory usage (#42147) (#42256)
  • Fix instruction for docker compose (#43119) (#43321)
  • Updates documentation to reflect that dag_warnings is returned instead of import_errors. (#42858) (#42888)

Apache Airflow 2.10.2

20 Sep 23:01
2.10.2

Significant Changes

No significant changes.

Bug Fixes

  • Revert "Fix: DAGs are not marked as stale if the dags folder change" (#42220, #42217)
  • Add missing open telemetry span and correct scheduled slots documentation (#41985)
  • Fix require_confirmation_dag_change (#42063) (#42211)
  • Only treat null/undefined as falsy when rendering XComEntry (#42199) (#42213)
  • Add extra and renderedTemplates as keys to skip camelCasing (#42206) (#42208)
  • Do not camelcase xcom entries (#42182) (#42187)
  • Fix task_instance and dag_run links from list views (#42138) (#42143)
  • Support multi-line input for Params of type string in trigger UI form (#40414) (#42139)
  • Fix details tab log url detection (#42104) (#42114)
  • Add new type of exception to catch timeout (#42064) (#42078)
  • Rewrite how DAG to dataset / dataset alias are stored (#41987) (#42055)
  • Allow dataset alias to add more than one dataset events (#42189) (#42247)

Miscellaneous

  • Limit universal-pathlib below 0.2.4 as it breaks our integration (#42101)
  • Auto-fix default deferrable with LibCST (#42089)
  • Deprecate --tree flag for tasks list cli command (#41965)

Doc Only Changes

  • Update security_model.rst to clear unauthenticated endpoints exceptions (#42085)
  • Add note about dataclasses and attrs to XComs page (#42056)
  • Improve docs on markdown docs in DAGs (#42013)
  • Add warning that listeners can be dangerous (#41968)

Apache Airflow 2.10.1

06 Sep 11:30

Significant Changes

No significant changes.

Bug Fixes

  • Handle Example dags case when checking for missing files (#41874)
  • Fix logout link in "no roles" error page (#41845)
  • Set end_date and duration for triggers completed with end_from_trigger as True. (#41834)
  • DAGs are not marked as stale if the dags folder change (#41829)
  • Fix compatibility with FAB provider versions <1.3.0 (#41809)
  • Don't Fail LocalTaskJob on heartbeat (#41810)
  • Remove deprecation warning for cgitb in Plugins Manager (#41793)
  • Fix log for notifier(instance) without name (#41699)
  • Splitting syspath preparation into stages (#41694)
  • Adding url sanitization for extra links (#41680)
  • Fix InletEventsAccessors type stub (#41607)
  • Fix UI rendering when XCom is INT, FLOAT, BOOL or NULL (#41605)
  • Fix try selector refresh (#41503)
  • Incorrect try number subtraction producing invalid span id for OTEL airflow (#41535)
  • Add WebEncoder for trigger page rendering to avoid render failure (#41485)
  • Adding tojson filter to example_inlet_event_extra example dag (#41890)
  • Add backward compatibility check for executors that don't inherit BaseExecutor (#41927)

Miscellaneous

  • Bump webpack from 5.76.0 to 5.94.0 in /airflow/www (#41879)
  • Adding rel property to hyperlinks in logs (#41783)
  • Field Deletion Warning when editing Connections (#41504)
  • Make Scarf usage reporting in major+minor versions and counters in buckets (#41900)
  • Lower down universal-pathlib minimum to 0.2.2 (#41943)
  • Protect against None components of universal pathlib xcom backend (#41938)

Doc Only Changes

  • Remove Debian bullseye support (#41569)
  • Add an example for auth with keycloak (#41791)

Apache Airflow 2.10.0

16 Aug 01:53
2.10.0

Significant Changes

Datasets no longer trigger inactive DAGs (#38891)

Previously, when a DAG was paused or removed, incoming dataset events would still
trigger it, and the DAG would run once it was unpaused or added back in a DAG
file. This has been changed; a DAG's dataset schedule can now only be satisfied
by events that occur when the DAG is active. While this is a breaking change,
the previous behavior is considered a bug.

The behavior of time-based scheduling is unchanged, including the timetable part
of DatasetOrTimeSchedule.

try_number is no longer incremented during task execution (#39336)

Previously, the try number (try_number) was incremented at the beginning of task execution on the worker. This was problematic for several reasons.
For one, it meant that the try number was incremented when it was not supposed to be, namely when resuming from reschedule or deferral. It also resulted in
the try number being "wrong" when the task had not yet started. The workarounds for these two issues caused a lot of confusion.

Now the try number for a task run is determined when the task is scheduled; it does not change in flight and is never decremented.
So after the task runs, the observed try number remains the same as it was when the task was running; only when there is a "new try" will the try number be incremented again.

One consequence of this change is that if users run tasks "manually" (e.g. by calling ti.run() directly, or with the airflow tasks run command),
the try number is no longer incremented. Airflow assumes that tasks are always run after being scheduled by the scheduler, so we do not regard this as a breaking change.
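
The semantics described above can be pictured with a small toy model. ToyTaskInstance and its methods are hypothetical names invented for this sketch and are not Airflow's TaskInstance API; the starting value of 0 is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskInstance:
    """Hypothetical stand-in for a task instance; not Airflow's TaskInstance."""

    try_number: int = 0

    def schedule_new_try(self) -> None:
        # Scheduling a new try is the only event that bumps the counter.
        self.try_number += 1

    def resume(self) -> None:
        # Resuming after a reschedule or deferral continues the same try,
        # so the counter is left untouched.
        pass


ti = ToyTaskInstance()
ti.schedule_new_try()        # the first try is scheduled
ti.resume()                  # e.g. a deferred task wakes up again
first_try = ti.try_number
ti.schedule_new_try()        # the task failed and a retry is scheduled
second_try = ti.try_number
```

In this model the value observed while the first try runs, and after it finishes, stays at 1 until a new try is scheduled.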

/logout endpoint in FAB Auth Manager is now CSRF protected (#40145)

The /logout endpoint's method in FAB Auth Manager has been changed from GET to POST in all existing
AuthViews (AuthDBView, AuthLDAPView, AuthOAuthView, AuthOIDView, AuthRemoteUserView), and
now includes CSRF protection to enhance security and prevent unauthorized logouts.

OpenTelemetry Traces for Apache Airflow (#37948)

This new feature lets Apache Airflow emit, in OpenTelemetry format, 1) Airflow system traces of the scheduler,
triggerer, executor, and processor, and 2) DAG run traces for deployed DAG runs. Previously, only metrics could be emitted in OpenTelemetry format.
Traces give users richer data, which they can send to OTLP-compatible endpoints using the OpenTelemetry standard.

Decorators for TaskFlow (@skip_if, @run_if) that make it simple to decide whether or not to skip a Task (#41116)

This feature adds decorators that run or skip a Task depending on a condition.
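
To illustrate the idea only, here is a self-contained sketch of condition-based skipping. It does not use Airflow: the skip_if and run_if functions, the TaskSkipped exception, and the context dictionary below are hypothetical stand-ins written for this example, and the real decorators may differ in signature and behavior.

```python
import functools


class TaskSkipped(Exception):
    """Hypothetical signal that a task chose to skip itself."""


def skip_if(condition, reason="condition was true"):
    """Skip the wrapped callable when condition(context) is truthy."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(context):
            if condition(context):
                raise TaskSkipped(reason)
            return func(context)
        return wrapper
    return decorator


def run_if(condition, reason="condition was false"):
    """Run the wrapped callable only when condition(context) is truthy."""
    return skip_if(lambda context: not condition(context), reason)


@run_if(lambda context: context["env"] == "prod")
def publish(context):
    return f"published from {context['env']}"


def outcome(context):
    """Report either the task result or why it was skipped."""
    try:
        return publish(context)
    except TaskSkipped as exc:
        return f"skipped: {exc}"
```

Calling outcome({"env": "prod"}) runs the task, while outcome({"env": "dev"}) reports a skip. Expressing run_if as the negation of skip_if keeps the two decorators consistent with each other.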

Using Multiple Executors Concurrently (#40701)

Previously known as hybrid executors, this new feature allows Airflow to use multiple executors concurrently. DAGs, or even individual tasks, can be configured
to use the specific executor that best suits their needs. A single DAG can contain tasks that all use different executors.
Note: This feature is still experimental. See the documentation on Executor <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/executor/index.html#using-multiple-executors-concurrently> for a more detailed description.
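
As a rough illustration, the sketch below models how a per-task executor choice could be resolved against a comma-separated list of configured executors. The function name resolve_executor, the fallback to the first listed executor, and the example executor list are assumptions made for this sketch and are not Airflow's implementation; the Executor documentation describes the actual configuration.

```python
from typing import Optional


def resolve_executor(configured: str, task_executor: Optional[str] = None) -> str:
    """Pick the executor for a task from a comma-separated list (toy model)."""
    executors = [name.strip() for name in configured.split(",") if name.strip()]
    if not executors:
        raise ValueError("no executor configured")
    if task_executor is None:
        # Assumption: tasks without an explicit choice use the first entry.
        return executors[0]
    if task_executor not in executors:
        raise ValueError(f"executor {task_executor!r} is not configured")
    return task_executor


configured = "LocalExecutor,CeleryExecutor"
default_choice = resolve_executor(configured)
explicit_choice = resolve_executor(configured, "CeleryExecutor")
```

Rejecting an executor that is not in the configured list makes a misspelled name fail loudly instead of silently falling back to the default.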

Scarf based telemetry: Does Airflow collect any telemetry data? (#39510)

Airflow integrates Scarf to collect basic usage data during operation. Deployments can opt out of data collection by setting the [usage_data_collection] enabled option to False, or by setting the SCARF_ANALYTICS=false environment variable.
See the FAQ entry on this <https://airflow.apache.org/docs/apache-airflow/stable/faq.html#does-airflow-collect-any-telemetry-data> for more information.
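
A minimal sketch of preparing an opted-out environment for an Airflow process follows. The helper name telemetry_opt_out_env is hypothetical. The first variable spells the [usage_data_collection] enabled option using Airflow's AIRFLOW__{SECTION}__{KEY} environment variable convention. The note above presents the two switches as alternatives, so setting either one should be enough; the sketch sets both purely for illustration.

```python
import os


def telemetry_opt_out_env(base_env=None):
    """Return a copy of the environment with usage data collection disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # The [usage_data_collection] enabled option, expressed as an env var.
    env["AIRFLOW__USAGE_DATA_COLLECTION__ENABLED"] = "False"
    # The alternative switch named in the release note.
    env["SCARF_ANALYTICS"] = "false"
    return env


env = telemetry_opt_out_env({"PATH": "/usr/bin"})
```

The returned dictionary can be passed as the env argument when launching a process, for example with subprocess.run.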

New Features

  • AIP-61 Hybrid Execution (AIP-61)
  • AIP-62 Getting Lineage from Hook Instrumentation (AIP-62)
  • AIP-64 TaskInstance Try History (AIP-64)
  • AIP-44 Internal API (AIP-44)
  • Enable ending the task directly from the triggerer without going into the worker. (#40084)
  • Extend dataset dependencies (#40868)
  • Feature/add token authentication to internal api (#40899)
  • Add DatasetAlias to support dynamic Dataset Event Emission and Dataset Creation (#40478)
  • Add example DAGs for inlet_events (#39893)
  • Implement accessors to read dataset events defined as inlet (#39367)
  • Decorator for Task Flow, to make it simple to apply whether or not to skip a Task. (#41116)
  • Add start execution from triggerer support to dynamic task mapping (#39912)
  • Add try_number to log table (#40739)
  • Added ds_format_locale method in macros which allows localizing datetime formatting using Babel (#40746)
  • Add DatasetAlias to support dynamic Dataset Event Emission and Dataset Creation (#40478, #40723, #40809, #41264, #40830, #40693, #41302)
  • Use sentinel to mark dag as removed on re-serialization (#39825)
  • Add parameter for the last number of queries to the DB in DAG file processing stats (#40323)
  • Add prototype version dark mode for Airflow UI (#39355)
  • Add ability to mark some tasks as successful in dag test (#40010)
  • Allow use of callable for template_fields (#37028)
  • Filter running/failed and active/paused dags on the home page (#39701)
  • Add metrics about task CPU and memory usage (#39650)
  • UI changes for DAG Re-parsing feature (#39636)
  • Add Scarf based telemetry (#39510, #41318)
  • Add dag re-parsing request endpoint (#39138)
  • Redirect to new DAGRun after trigger from Grid view (#39569)
  • Display endDate in task instance tooltip. (#39547)
  • Implement accessors to read dataset events defined as inlet (#39367, #39893)
  • Add color to log lines in UI for error and warnings based on keywords (#39006)
  • Add Rendered k8s pod spec tab to ti details view (#39141)
  • Make audit log before/after filterable (#39120)
  • Consolidate grid collapse actions to a single full screen toggle (#39070)
  • Implement Metadata to emit runtime extra (#38650)
  • Add executor field to the DB and parameter to the operators (#38474)
  • Implement context accessor for DatasetEvent extra (#38481)
  • Add dataset event info to dag graph (#41012)
  • Add button to toggle datasets on/off in dag graph (#41200)
  • Add run_if & skip_if decorators (#41116)
  • Add dag_stats rest api endpoint (#41017)
  • Add listeners for Dag import errors (#39739)
  • Allowing DateTimeSensorAsync, FileSensor and TimeSensorAsync to start execution from trigger during dynamic task mapping (#41182)

Improvements

  • Allow setting Dag Run resource permissions at the Dag level: extends Dag's access_control feature to allow Dag Run resource permissions. (#40703)
  • Improve security and error handling for the internal API (#40999)
  • Datasets UI Improvements (#40871)
  • Change DAG Audit log tab to Event Log (#40967)
  • Make standalone dag file processor work in DB isolation mode (#40916)
  • Show only the source on the consumer DAG page and only triggered DAG run in the producer DAG page (#41300)
  • Update metrics names to allow multiple executors to report metrics (#40778)
  • Format DAG run count (#39684)
  • Update styles for renderedjson component (#40964)
  • Improve ATTRIBUTE_REMOVED sentinel to use class and more context (#40920)
  • Make XCom display as react json (#40640)
  • Replace usages of task context logger with the log table (#40867)
  • Rollback for all retry exceptions (#40882) (#40883)
  • Support rendering ObjectStoragePath value (#40638)
  • Add try_number and map_index as params for log event endpoint (#40845)
  • Rotate fernet key in batches to limit memory usage (#40786)
  • Add gauge metric for 'last_num_of_db_queries' parameter (#40833)
  • Set parallelism log messages to warning level for better visibility (#39298)
  • Add error handling for encoding the dag runs (#40222)
  • Use params instead of dag_run.conf in example DAG (#40759)
  • Load Example Plugins with Example DAGs (#39999)
  • Stop deferring TimeDeltaSensorAsync task when the target_dttm is in the past (#40719)
  • Send important executor logs to task logs (#40468)
  • Open external links in new tabs (#40635)
  • Attempt to add ReactJSON view to rendered templates (#40639)
  • Speeding up regex match time for custom warnings (#40513)
  • Refactor DAG.dataset_triggers into the timetable class (#39321)
  • add next_kwargs to StartTriggerArgs (#40376)
  • Improve UI error handling (#40350)
  • Remove double warning in CLI when config value is deprecated (#40319)
  • Implement XComArg concat() (#40172)
  • Added get_extra_dejson method with a nested parameter that controls whether JSON nested as a string is also deserialized (#39811)
  • Add executor field to the task instance API (#40034)
  • Support checking for db path absoluteness on Windows (#40069)
  • Introduce StartTriggerArgs and prevent start trigger initialization in scheduler (#39585)
  • Add task documentation to details tab in grid view (#39899)
  • Allow executors to be specified with only the class name of the Executor (#40131)
  • Remove obsolete conditional logic related to try_number (#40104)
  • Allow Task Group Ids to be passed as branches in BranchMixIn (#38883)
  • Javascript connection form will apply CodeMirror to all textareas dynamically (#39812)
  • Determine needs_expansion at time of serialization (#39604)
  • Add indexes on dag_id column in referencing tables to speed up deletion of dag records (#39638)
  • ...

Apache Airflow Helm Chart 1.15.0

24 Jul 12:45
helm-chart/1.15.0
ff7463b

Significant Changes

Default Airflow image is updated to 2.9.3 (#40816)

The default Airflow image that is used with the Chart is now 2.9.3; previously it was 2.9.2.

Default PgBouncer Exporter image has been updated (#40318)

The PgBouncer Exporter image has been updated to airflow-pgbouncer-exporter-2024.06.18-0.17.0, which addresses CVE-2024-24786.

New Features

  • Add git-sync container lifecycle hooks (#40369)
  • Add init containers for jobs (#40454)
  • Add persistent volume claim retention policy (#40271)
  • Add annotations for Redis StatefulSet (#40281)
  • Add dags.gitSync.sshKey, which allows the git-sync private key to be configured in the values file directly (#39936)
  • Add extraEnvFrom to git-sync containers (#39031)

Improvements

  • Link in UIAlert to production guide when a dynamic webserver secret is used now opens in a new tab (#40635)
  • Support disabling helm hooks on extraConfigMaps and extraSecrets (#40294)

Bug Fixes

  • Add git-sync ssh secret to DAG processor (#40691)
  • Fix duplicated safeToEvict annotations (#40554)
  • Add missing triggerer.keda.usePgbouncer to values.yaml (#40614)
  • Trim leading // character using mysql backend (#40401)

Doc only changes

  • Updating chart download link to use the Apache download CDN (#40618)

Misc

  • Update PgBouncer exporter image to airflow-pgbouncer-exporter-2024.06.18-0.17.0 (#40318)
  • Default airflow version to 2.9.3 (#40816)
  • Fix startupProbe timing comment (#40412)

Apache Airflow 2.9.3

16 Jul 11:30
81845de

Significant Changes

Time unit for scheduled_duration and queued_duration changed (#37936)

scheduled_duration and queued_duration metrics are now emitted in milliseconds instead of seconds.

By convention, all statsd metrics should be emitted in milliseconds; downstream tools such as the Prometheus statsd-exporter expect this.
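
For anyone comparing old and new values of these metrics, the unit change amounts to a factor of 1000. A minimal sketch of computing a duration in milliseconds from two timestamps is shown below; the function name duration_ms and the timestamps are made up for illustration and are not taken from Airflow's code.

```python
from datetime import datetime, timedelta, timezone


def duration_ms(start: datetime, end: datetime) -> float:
    """Return the elapsed time between two timestamps in milliseconds."""
    # Dividing one timedelta by another yields a plain float ratio.
    return (end - start) / timedelta(milliseconds=1)


queued_at = datetime(2024, 7, 16, 11, 30, 0, tzinfo=timezone.utc)
started_at = queued_at + timedelta(seconds=2, milliseconds=500)
queued_duration = duration_ms(queued_at, started_at)  # 2.5 seconds as ms
```

Dashboards or alert thresholds that were written against values in seconds would need the same factor applied.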

Support for OpenTelemetry Metrics is no longer "Experimental" (#40286)

Experimental support for OpenTelemetry was added in 2.7.0. Fixes and improvements have been added since then, and the feature is now announced as stable.

Bug Fixes

  • Fix calendar view scroll (#40458)
  • Validating provider description for urls in provider list view (#40475)
  • Fix compatibility with old MySQL 8.0 (#40314)
  • Fix dag (un)pausing won't work on environment where dag files are missing (#40345)
  • Extra being passed to SQLalchemy (#40391)
  • Handle unsupported operand int + str when value of tag is int (job_id) (#40407)
  • Fix TriggeredDagRunOperator triggered link (#40336)
  • Add [webserver]update_fab_perms to deprecated configs (#40317)
  • Swap dag run link from legacy graph to grid with graph tab (#40241)
  • Change httpx to requests in file_task_handler (#39799)
  • Fix import future annotations in venv jinja template (#40208)
  • Ensures DAG params order regardless of backend (#40156)
  • Use a join for TI notes in TI batch API endpoint (#40028)
  • Improve trigger UI for string array format validation (#39993)
  • Disable jinja2 rendering for doc_md (#40522)
  • Skip checking sub dags list if taskinstance state is skipped (#40578)
  • Recognize quotes when parsing urls in logs (#40508)

Doc Only Changes

  • Add notes about passing secrets via environment variables (#40519)
  • Revamp some confusing log messages (#40334)
  • Add more precise description of masking sensitive field names (#40512)
  • Add slightly more detailed guidance about upgrading to the docs (#40227)
  • Metrics allow_list complete example (#40120)
  • Add warning to deprecated api docs that access control isn't applied (#40129)
  • Simpler command to check local scheduler is alive (#40074)
  • Add a note and an example clarifying the usage of DAG-level params (#40541)
  • Fix highlight of example code in dags.rst (#40114)
  • Add warning about the PostgresOperator being deprecated (#40662)
  • Updating airflow download links to CDN based links (#40618)
  • Fix import statement for DatasetOrTimetable example (#40601)
  • Further clarify triage process (#40536)
  • Fix param order in PythonOperator docstring (#40122)
  • Update serializers.rst to mention that bytes are not supported (#40597)

Miscellaneous

  • Upgrade build installers and dependencies (#40177)
  • Bump braces from 3.0.2 to 3.0.3 in /airflow/www (#40180)
  • Upgrade to another version of trove-classifier (new CUDA classifiers) (#40564)
  • Rename "try_number" increments that are unrelated to the airflow concept (#39317)
  • Update trove classifiers to the latest version as build dependency (#40542)
  • Upgrade to latest version of hatchling as build dependency (#40387)
  • Fix bug in SchedulerJobRunner._process_executor_events (#40563)
  • Remove logging for "blocked" events (#40446)

Apache Airflow Helm Chart 1.14.0

19 Jun 00:26
helm-chart/1.14.0
8eebe2b

Significant Changes

ClusterRole and ClusterRoleBinding names have been updated to be unique (#37197)

ClusterRoles and ClusterRoleBindings created when multiNamespaceMode is enabled have been renamed to ensure unique names:

  • {{ include "airflow.fullname" . }}-pod-launcher-role has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-launcher-role
  • {{ include "airflow.fullname" . }}-pod-launcher-rolebinding has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-launcher-rolebinding
  • {{ include "airflow.fullname" . }}-pod-log-reader-role has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-log-reader-role
  • {{ include "airflow.fullname" . }}-pod-log-reader-rolebinding has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-pod-log-reader-rolebinding
  • {{ include "airflow.fullname" . }}-scc-rolebinding has been renamed to {{ .Release.Namespace }}-{{ include "airflow.fullname" . }}-scc-rolebinding
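
If external tooling refers to these resources by name, the renaming above can be sketched as a simple mapping. The namespace "data-eng" and the fullname "airflow" are hypothetical example values; the actual fullname depends on how the chart renders the airflow.fullname template for a given release.

```python
ROLE_SUFFIXES = (
    "pod-launcher-role",
    "pod-launcher-rolebinding",
    "pod-log-reader-role",
    "pod-log-reader-rolebinding",
    "scc-rolebinding",
)


def renamed_cluster_resources(namespace: str, fullname: str) -> dict[str, str]:
    """Map each previous resource name to its namespace-prefixed name."""
    return {
        f"{fullname}-{suffix}": f"{namespace}-{fullname}-{suffix}"
        for suffix in ROLE_SUFFIXES
    }


mapping = renamed_cluster_resources("data-eng", "airflow")
```

Each key is the old name and each value is the new one, which can help when searching for references that need updating.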

workers.safeToEvict default changed to False (#40229)

workers.safeToEvict now defaults to False. This is a safer default
as it prevents the nodes workers are running on from being scaled down by the
K8s Cluster Autoscaler <https://kubernetes.io/docs/concepts/cluster-administration/cluster-autoscaling/#cluster-autoscaler>.
If you would like to retain the previous behavior, you can set this config to True.

Default Airflow image is updated to 2.9.2 (#40160)

The default Airflow image that is used with the Chart is now 2.9.2; previously it was 2.8.3.

Default StatsD image is updated to v0.26.1 (#38416)

The default StatsD image that is used with the Chart is now v0.26.1; previously it was v0.26.0.

New Features

  • Enable MySQL KEDA support for triggerer (#37365)
  • Allow AWS Executors (#38524)

Improvements

  • Allow valueFrom in env config of components (#40135)
  • Enable templating in extraContainers and extraInitContainers (#38507)
  • Add safe-to-evict annotation to pod-template-file (#37352)
  • Support workers.command for KubernetesExecutor (#39132)
  • Add priorityClassName to Jobs (#39133)
  • Add Kerberos sidecar to pod-template-file (#38815)
  • Add templated field support for extra containers (#38510)

Bug Fixes

  • Set workers.safeToEvict default to False (#40229)

Doc only changes

  • Document extraContainers and extraInitContainers that are templated (#40033)
  • Fix typo in HorizontalPodAutoscaling documentation (#39307)
  • Fix supported k8s versions in docs (#39172)
  • Fix typo in YAML path for brokerUrlSecretName (#39115)

Misc

  • Default Airflow version to 2.9.2 (#40160)
  • Limit Redis image to 7.2 (#38928)
  • Build Helm values schemas with Kubernetes 1.29 resources (#38460)
  • Add missing containers to resources docs (#38534)
  • Upgrade StatsD Exporter image to 0.26.1 (#38416)
  • Remove K8S 1.25 support (#38367)

Apache Airflow 2.9.2

10 Jun 10:58
f56f134

Significant Changes

No significant changes.

Bug Fixes

  • Fix bug that makes AirflowSecurityManagerV2 leave transactions in the idle in transaction state (#39935)
  • Fix alembic auto-generation and rename mismatching constraints (#39032)
  • Add the existing_nullable to the downgrade side of the migration (#39374)
  • Fix Mark Instance state buttons stay disabled if user lacks permission (#37451) (#38732)
  • Use SKIP LOCKED instead of NOWAIT in mini scheduler (#39745)
  • Remove DAG Run Add option from FAB view (#39881)
  • Add max_consecutive_failed_dag_runs in API spec (#39830)
  • Fix example_branch_operator failing in python 3.12 (#39783)
  • Fetch served logs also when task attempt is up for retry and no remote logs available (#39496)
  • Change dataset URI validation to raise warning instead of error in Airflow 2.9 (#39670)
  • Visible DAG RUN doesn't point to the same dag run id (#38365)
  • Refactor SafeDogStatsdLogger to use get_validator to enable pattern matching (#39370)
  • Fix custom actions in security manager has_access (#39421)
  • Fix HTTP 500 Internal Server Error if DAG is triggered with bad params (#39409)
  • Fix static file caching is disabled in Airflow Webserver. (#39345)
  • Fix TaskHandlerWithCustomFormatter now adds prefix only once (#38502)
  • Do not provide deprecated execution_date in @apply_lineage (#39327)
  • Add missing conn_id to string representation of ObjectStoragePath (#39313)
  • Fix sql_alchemy_engine_args config example (#38971)
  • Add Cache-Control "no-store" to all dynamically generated content (#39550)

Miscellaneous

  • Limit yandex provider to avoid mypy errors (#39990)
  • Warn on mini scheduler failures instead of debug (#39760)
  • Change type definition for provider_info_cache decorator (#39750)
  • Better typing for BaseOperator defer (#39742)
  • More typing in TimeSensor and TimeSensorAsync (#39696)
  • Re-raise exception from strict dataset URI checks (#39719)
  • Fix stacklevel for _log_state helper (#39596)
  • Resolve SA warnings in migrations scripts (#39418)
  • Remove unused index idx_last_scheduling_decision on dag_run table (#39275)

Doc Only Changes

  • Provide extra tip on labeling DynamicTaskMapping (#39977)
  • Improve visibility of links / variables / other configs in Configuration Reference (#39916)
  • Remove 'legacy' definition for CronDataIntervalTimetable (#39780)
  • Update plugins.rst examples to use pyproject.toml over setup.py (#39665)
  • Fix nit in pg set-up doc (#39628)
  • Add Matomo to Tracking User Activity docs (#39611)
  • Fix Connection.get -> Connection.get_connection_from_secrets (#39560)
  • Adding note for provider dependencies (#39512)
  • Update docker-compose command (#39504)
  • Update note about restarting triggerer process (#39436)
  • Updating S3LogLink with an invalid bucket link (#39424)
  • Update testing_packages.rst (#38996)
  • Add multi-team diagrams (#38861)

Apache Airflow 2.9.1

06 May 10:30
2.9.1
2d53c10

Significant Changes

Stackdriver logging bugfix requires Google provider 10.17.0 or later (#38071)

If you use Stackdriver logging, you must use Google provider version 10.17.0 or later. Airflow 2.9.1 now passes gcp_log_name to the StackdriverTaskHandler instead of name, and this will fail on earlier provider versions.

This fixes a bug where the log name configured in [logging] remote_base_log_folder was overridden when Airflow configured logging, resulting in task logs going to the wrong destination.

Bug Fixes

  • Make task log messages include run_id (#39280)
  • Copy menu_item href for nav bar (#39282)
  • Fix trigger kwarg encryption migration (#39246, #39361, #39374)
  • Add workaround for datetime-local input in firefox (#39261)
  • Add Grid button to Task Instance view (#39223)
  • Get served logs when remote or executor logs not available for non-running task try (#39177)
  • Fixed side effect of menu filtering causing disappearing menus (#39229)
  • Use grid view for Task Instance's log_url (#39183)
  • Improve task filtering UX (#39119)
  • Improve rendered_template ux in react dag page (#39122)
  • Graph view improvements (#38940)
  • Check that the dataset<>task exists before trying to render graph (#39069)
  • Hostname was "redacted", not "redact"; remove it when there is no context (#39037)
  • Check whether AUTH_ROLE_PUBLIC is set in check_authentication (#39012)
  • Move rendering of map_index_template so it renders for failed tasks as long as it was defined before the point of failure (#38902)
  • Undeprecate BaseXCom.get_one method for now (#38991)
  • Add inherit_cache attribute for CreateTableAs custom SA Clause (#38985)
  • Don't wait for DagRun lock in mini scheduler (#38914)
  • Fix calendar view with no DAG Run (#38964)
  • Changed the background color of external task in graph (#38969)
  • Fix dag run selection (#38941)
  • Fix SAWarning 'Coercing Subquery object into a select() for use in IN()' (#38926)
  • Fix implicit cartesian product in AirflowSecurityManagerV2 (#38913)
  • Fix problem that links in legacy log view can not be clicked (#38882)
  • Fix dag run link params (#38873)
  • Use async db calls in WorkflowTrigger (#38689)
  • Fix audit log events filter (#38719)
  • Use methodtools.lru_cache instead of functools.lru_cache in class methods (#37757)
  • Raise deprecated warning in airflow dags backfill only if -I / --ignore-first-depends-on-past provided (#38676)

Miscellaneous

  • TriggerDagRunOperator deprecate execution_date in favor of logical_date (#39285)
  • Force to use Airflow Deprecation warnings categories on @deprecated decorator (#39205)
  • Add warning about run/import Airflow under the Windows (#39196)
  • Update is_authorized_custom_view from auth manager to handle custom actions (#39167)
  • Add in Trove classifiers Python 3.12 support (#39004)
  • Use debug level for minischeduler skip (#38976)
  • Bump undici from 5.28.3 to 5.28.4 in /airflow/www (#38751)

Doc Only Changes

  • Fix supported k8s version in docs (#39172)
  • Dynamic task mapping PythonOperator op_kwargs (#39242)
  • Add link to user and role commands (#39224)
  • Add k8s 1.29 to supported version in docs (#39168)
  • Data aware scheduling docs edits (#38687)
  • Update DagBag class docstring to include all params (#38814)
  • Correcting an example taskflow example (#39015)
  • Remove decorator from rendering fields example (#38827)

Apache Airflow 2.9.0

08 Apr 12:11
2.9.0
50f22ff

Significant Changes

The following Listener API methods are now considered stable and can be used in production systems (they were an experimental feature in older Airflow versions) (#36376):

Lifecycle events:

  • on_starting
  • before_stopping

DagRun State Change Events:

  • on_dag_run_running
  • on_dag_run_success
  • on_dag_run_failed

TaskInstance State Change Events:

  • on_task_instance_running
  • on_task_instance_success
  • on_task_instance_failed

Support for Microsoft SQL-Server for Airflow Meta Database has been removed (#36514)

After discussion (https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4)
and a voting process (https://lists.apache.org/thread/pgcgmhf6560k8jbsmz8nlyoxosvltph2),
the Airflow PMC and Committers have resolved to no longer maintain MsSQL as a supported Database Backend.

As of Airflow 2.9.0, support for MsSQL as an Airflow Database Backend has been removed.

A migration script that can help migrate the database before upgrading to Airflow 2.9.0 is available in the
airflow-mssql-migration repo on GitHub (https://github.com/apache/airflow-mssql-migration).
Note that the migration script is provided without support or warranty.

This does not affect the existing provider packages (operators and hooks); DAGs can still access and process data from MsSQL.

Dataset URIs are now validated on input (#37005)

Datasets must use a URI that conforms to the rules laid down in AIP-60, and the value
will be automatically normalized when the DAG file is parsed. See the
documentation on Datasets (https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/datasets.html) for
a more detailed description of the rules.

You may need to change your Dataset identifiers if they look like a URI, but are
used in a less mainstream way, such as relying on the URI's auth section, or
have a case-sensitive protocol name.
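To get a rough idea of which existing identifiers might be affected, a small standard-library check along the following lines could be run over your Dataset URIs. This is a sketch only and does not reproduce Airflow's own validation or normalization; the function name `dataset_uri_warnings` and the warning texts are hypothetical.

```python
from urllib.parse import urlsplit


def dataset_uri_warnings(uri: str) -> list[str]:
    """Flag the two patterns called out above; not Airflow's validator."""
    warnings = []
    if "://" not in uri:
        # Does not look like a URI, so neither pattern applies.
        return warnings
    raw_scheme = uri.split("://", 1)[0]
    parts = urlsplit(uri)
    # urlsplit lowercases the scheme, so compare against the raw text.
    if raw_scheme != raw_scheme.lower():
        warnings.append("scheme is not lowercase")
    # username/password are populated only when an auth section is present.
    if parts.username is not None or parts.password is not None:
        warnings.append("URI has an auth section")
    return warnings


print(dataset_uri_warnings("S3://bucket/key"))
print(dataset_uri_warnings("postgres://user:pw@host/db"))
```

Identifiers that come back with warnings are candidates for a closer look against the Dataset documentation.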

The method get_permitted_menu_items in BaseAuthManager has been renamed filter_permitted_menu_items (#37627)

Add REST API actions to Audit Log events (#37734)

The Audit Log event name for REST API events will be prefixed with api. or ui., depending on whether the request came from outside Airflow or from the Airflow UI.
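If you post-process Audit Log entries, the prefix can be used to split events by origin. The sketch below assumes only the api. and ui. prefixes described above; the helper name `event_source` and the sample event names are hypothetical.

```python
def event_source(event_name: str) -> str:
    """Return 'api' or 'ui' for prefixed event names, otherwise 'other'."""
    prefix, sep, _ = event_name.partition(".")
    if sep and prefix in ("api", "ui"):
        return prefix
    return "other"


# Sample event names are made up for illustration.
for name in ("api.example_event", "ui.example_event", "example_event"):
    print(name, "->", event_source(name))
```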

Official support for Python 3.12 (#38025)

There are a few caveats though:

  • Pendulum 2 does not support Python 3.12. For Python 3.12 you need to use
    Pendulum 3 (https://pendulum.eustace.io/blog/announcing-pendulum-3-0-0.html)

  • Minimum SQLAlchemy version supported when Pandas is installed for Python 3.12 is 1.4.36 released in
    April 2022. Airflow 2.9.0 increases the minimum supported version of SQLAlchemy to 1.4.36 for all
    Python versions.

Not all Providers support Python 3.12. At the initial release of Airflow 2.9.0 the following providers
are released without support for Python 3.12:

  • apache.beam - pending Apache Beam support for 3.12 (https://github.com/apache/beam/issues/29149)
  • papermill - pending the release of a Python 3.12 compatible papermill client version
    that includes this merged change (https://github.com/nteract/papermill/pull/771)

Prevent large string objects from being stored in the Rendered Template Fields (#38094)

There's now a limit to the length of data that can be stored in the Rendered Template Fields.
The limit is set to 4096 characters. If the data exceeds this limit, it will be truncated. You can change this limit
by setting the [core]max_template_field_length configuration option in your airflow config.
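The following sketch shows how the option could appear in a config fragment and what a length cap means in practice. It uses only the standard library; `read_limit` and `truncate` are hypothetical helpers, the value 8192 is an arbitrary example, and the plain cut shown here is an assumption rather than the exact form Airflow stores.

```python
import configparser

# Example config fragment; 8192 is an arbitrary illustrative value.
CFG = """
[core]
max_template_field_length = 8192
"""


def read_limit(cfg_text: str, default: int = 4096) -> int:
    """Read the limit from config text, falling back to the documented default."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.getint("core", "max_template_field_length", fallback=default)


def truncate(rendered: str, limit: int) -> str:
    """Cut a rendered value to at most `limit` characters (simplified)."""
    return rendered if len(rendered) <= limit else rendered[:limit]


limit = read_limit(CFG)
print(limit, len(truncate("x" * 10_000, limit)))
```

The same option can also be set through Airflow's usual environment variable form, AIRFLOW__CORE__MAX_TEMPLATE_FIELD_LENGTH.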

Change xcom table column value type to longblob for MySQL backend (#38401)

The Xcom table's value column type has changed from blob to longblob. This allows you to store relatively big data in Xcom, but the migration can take a significant amount of time if you have a lot of large data stored in Xcom.

To downgrade from revision: b4078ac230a1, ensure that you don't have Xcom values larger than 65,535 bytes. Otherwise, you'll need to clean those rows or run airflow db clean xcom to clean the Xcom table.
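The 65,535-byte figure is the maximum size of a MySQL blob column. As a rough illustration of the size check, assuming you already have the serialized bytes of a value in hand (the helper name `fits_in_blob` and the sample payloads are hypothetical):

```python
import json

# Maximum size of a MySQL BLOB column, in bytes (2**16 - 1).
MYSQL_BLOB_MAX_BYTES = 65_535


def fits_in_blob(serialized: bytes) -> bool:
    """True if the serialized value would fit in a plain blob column."""
    return len(serialized) <= MYSQL_BLOB_MAX_BYTES


small = json.dumps({"rows": 10}).encode("utf-8")
large = json.dumps({"data": "x" * 100_000}).encode("utf-8")
print(fits_in_blob(small), fits_in_blob(large))
```

Values for which the check is false are the ones that would need to be cleaned before a downgrade.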

New Features

  • Allow users to write dag_id and task_id in their national characters, added display name for dag / task (v2) (#38446)
  • Prevent large objects from being stored in the RTIF (#38094)
  • Use current time to calculate duration when end date is not present. (#38375)
  • Add average duration mark line in task and dagrun duration charts. (#38214, #38434)
  • Add button to manually create dataset events (#38305)
  • Add Matomo as an option for analytics_tool. (#38221)
  • Experimental: Support custom weight_rule implementation to calculate the TI priority_weight (#38222)
  • Adding ability to automatically set DAG to off after X times it failed sequentially (#36935)
  • Add dataset conditions to next run datasets modal (#38123)
  • Add task log grouping to UI (#38021)
  • Add dataset_expression to grid dag details (#38121)
  • Introduce mechanism to support multiple executor configuration (#37635)
  • Add color formatting for ANSI chars in logs from task executions (#37985)
  • Add the dataset_expression as part of DagModel and DAGDetailSchema (#37826)
  • Add TaskFail entries to Gantt chart (#37918)
  • Allow longer rendered_map_index (#37798)
  • Inherit the run_ordering from DatasetTriggeredTimetable for DatasetOrTimeSchedule (#37775)
  • Implement AIP-60 Dataset URI formats (#37005)
  • Introducing Logical Operators for dataset conditional logic (#37101)
  • Add post endpoint for dataset events (#37570)
  • Show custom instance names for a mapped task in UI (#36797)
  • Add excluded/included events to get_event_logs api (#37641)
  • Add datasets to dag graph (#37604)
  • Show dataset events above task/run details in grid view (#37603)
  • Introduce new config variable to control whether DAG processor outputs to stdout (#37439)
  • Make Datasets hashable (#37465)
  • Add conditional logic for dataset triggering (#37016)
  • Implement task duration page in react. (#35863)
  • Add queuedEvent endpoint to get/delete DatasetDagRunQueue (#37176)
  • Support multiple XCom output in the BaseOperator (#37297)
  • AIP-58: Add object storage backend for xcom (#37058)
  • Introduce DatasetOrTimeSchedule (#36710)
  • Add on_skipped_callback to BaseOperator (#36374)
  • Allow override of hovered navbar colors (#36631)
  • Create new Metrics with Tagging (#36528)
  • Add support for openlineage to AFS and common.io (#36410)
  • Introduce @task.bash TaskFlow decorator (#30176, #37875)
  • Added functionality to automatically ingest custom airflow.cfg file upon startup (#36289)

Improvements

  • More human friendly "show tables" output for db cleanup (#38654)
  • Improve trigger assign_unassigned by merging alive_triggerer_ids and get_sorted_triggers queries (#38664)
  • Add exclude/include events filters to audit log (#38506)
  • Clean up unused triggers in a single query for all dialects except MySQL (#38663)
  • Update Confirmation Logic for Config Changes on Sensitive Environments Like Production (#38299)
  • Improve datasets graph UX (#38476)
  • Only show latest dataset event timestamp after last run (#38340)
  • Add button to clear only failed tasks in a dagrun. (#38217)
  • Delete all old dag pages and redirect to grid view (#37988)
  • Check task attribute before use in sentry.add_tagging() (#37143)
  • Mysql change xcom value col type for MySQL backend (#38401)
  • ExternalPythonOperator use version from sys.version_info (#38377)
  • Replace too broad exceptions into the Core (#38344)
  • Add CLI support for bulk pause and resume of DAGs (#38265)
  • Implement methods on TaskInstancePydantic and DagRunPydantic (#38295, #38302, #38303, #38297)
  • Made filters bar collapsible and add a full screen toggle (#38296)
  • Encrypt all trigger attributes (#38233, #38358, #38743)
  • Upgrade react-table package. Use with Audit Log table (#38092)
  • Show if dag page filters are active (#38080)
  • Add try number to mapped instance (#38097)
  • Add retries to job heartbeat (#37541)
  • Add REST API events to Audit Log (#37734)
  • Make current working directory as templated field in BashOperator (#37968)
  • Add calendar view to react (#37909)
  • Add run_id column to log table (#37731)
  • Add tryNumber to grid task instance tooltip (#37911)
  • Session is not used in _do_render_template_fields (#37856)
  • Improve MappedOperator property types (#37870)
  • Remove provide_session decorator from TaskInstancePydantic methods (#37853)
  • Ensure the "airflow.task" logger used for TaskInstancePydantic and TaskInstance (#37857)
  • Better error message for internal api call error (#37852)
  • Increase tooltip size of dag grid view (#37782) (#37805)
  • Use named loggers instead of root logger (#37801)
  • Add Run Duration in React (#37735)
  • Avoid non-recommended usage of logging (#37792)
  • Improve DateTimeTrigger typing (#37694)
  • Make sure all unique run_ids render a task duration bar (#37717)
  • Add Dag Audit Log to React (#37682)
  • Add log event for auto pause (#38243)
  • Better message for exception for templated base operator fields (#37668)
  • Clean up webserver endpoints adding to audit log (#37580)
  • Filter datasets graph by dag_id (#37464)
  • Use new exception type inheriting BaseException for SIGTERMs (#37613)
  • Refactor dataset class inheritance (#37590)
  • Simplify checks for package versions (#37585)
  • Filter Datasets by associated dag_ids (GET /datasets) (#37512)
  • Enable "airflow tasks test" to run deferrable operator (#37542)
  • Make datasets list/graph width adjustable (#37425)
  • Speedup determine installed airflow version in ExternalPythonOperator (#37409)
  • Add more task details from rest api (#37394)
  • Add confirmation dialog box for DAG run actions (#35393)
  • Added shutdown color to the STATE_COLORS (#37295)
  • Remove legacy dag details page and redirect to grid (#37232)
  • Order XCom entries by map index in API (#37086...