Add image builder user defined #3180

Open
wants to merge 7 commits into
base: master
from

Conversation

arbaobao
Contributor

@arbaobao arbaobao commented Mar 7, 2025

Tracking issue

Why are the changes needed?

In some environments there is no internet access, so the image build cannot pull the uv and micromamba images from public registries.

What changes were proposed in this pull request?

This PR adds an ImageBuilderConfig that detects whether a user-defined image_builder section is present in config.yaml.

How was this patch tested?

This patch can be tested by adding image_builder to ~/.flyte/config.yaml

image_builder:
  uv_image: arbaobao/flytekit:uv
  micromamba_image: arbaobao/flyte-test-images:micromamba-amd64

If the images are not specified, the defaults are ghcr.io/astral-sh/uv:0.5.1 and mambaorg/micromamba:2.0.3-debian12-slim.
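The fallback described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `ImageBuilderSettings` and `from_config` are hypothetical names, and the config is assumed to be already parsed into a plain mapping rather than read through flytekit's ConfigFile.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ImageBuilderSettings:  # hypothetical name, not the class added by this PR
    # Defaults quoted in the PR description.
    uv_image: str = "ghcr.io/astral-sh/uv:0.5.1"
    micromamba_image: str = "mambaorg/micromamba:2.0.3-debian12-slim"

    @classmethod
    def from_config(cls, config: Optional[Mapping[str, Any]]) -> "ImageBuilderSettings":
        """Build settings from an already-parsed config mapping."""
        section = (config or {}).get("image_builder") or {}
        # Keep only the keys the user actually set; the dataclass supplies the rest.
        known = {
            key: value
            for key, value in section.items()
            if key in ("uv_image", "micromamba_image") and value
        }
        return cls(**known)


# Only uv_image is overridden; micromamba_image keeps its default.
settings = ImageBuilderSettings.from_config(
    {"image_builder": {"uv_image": "arbaobao/flytekit:uv"}}
)
```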

Setup process

  1. Run a flyte sandbox and add image_builder to the config.yaml.
     [Screenshot 2025-03-07 at 3 09 38 PM]
  2. If you use image_spec to run the workflow, the image you specified is used to build the task image.
     [Screenshot 2025-03-07 at 3 09 01 PM]

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This PR adds support for user-defined image builder configuration to enhance flexibility in offline or air-gapped environments. It refactors configuration classes to better separate default from user-defined settings, replacing legacy configuration with new classes including ImageBuilderConfig and updated DataConfig. Docker file templates are revised to allow users to override default images through configuration files.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 2

@flyte-bot
Contributor

flyte-bot commented Mar 7, 2025

Code Review Agent Run #4675f8

Actionable Suggestions - 2
  • flytekit/image_spec/default_builder.py - 1
    • Consider defining default values for template variables · Line 96-97
  • flytekit/configuration/__init__.py - 1
Review Details
  • Files reviewed - 3 · Commit Range: 44c7cc0..e10a65c
    • flytekit/configuration/__init__.py
    • flytekit/configuration/internal.py
    • flytekit/image_spec/default_builder.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito

@flyte-bot
Contributor

flyte-bot commented Mar 7, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
New Feature - User-defined Image Builder Configuration

__init__.py - Introduces new dataclass-based configuration classes (DefaultConfig and ImageBuilderConfig) and their auto methods to load user-defined image builder settings.

internal.py - Adds new configuration entries (IMAGE_BUILDER_NAME, UV_IMAGE, and MICROMAMBA_IMAGE) to support the image builder feature.

Feature Improvement - Docker Builder Template Parameterization

default_builder.py - Replaces hardcoded image sources with configurable variables in the Docker template and updates the docker context creation to utilize the user-specified image values.

Comment on lines +96 to +97
FROM $UV_IMAGE as uv
FROM $MICROMAMBA_IMAGE as micromamba
Contributor

Consider defining default values for template variables

Consider defining the default values for UV_IMAGE and MICROMAMBA_IMAGE variables to maintain backward compatibility. Currently, these variables are used in the Dockerfile template but their default values are not defined, which might cause issues if they're not provided when the template is rendered.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- FROM $UV_IMAGE as uv
- FROM $MICROMAMBA_IMAGE as micromamba
+ # Default to the previously hardcoded versions if not specified
+ FROM ${UV_IMAGE:-ghcr.io/astral-sh/uv:0.5.1} as uv
+ FROM ${MICROMAMBA_IMAGE:-mambaorg/micromamba:2.0.3-debian12-slim} as micromamba
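A note on the suggestion above, assuming the Dockerfile template is rendered with Python's `string.Template` (an assumption here, the thread only says "template"): the shell-style `${VAR:-default}` form is not a valid `string.Template` placeholder, so one option is to resolve the defaults in Python before substituting. The sketch below uses a reduced stand-in template, and `render_from_lines` is a hypothetical helper.

```python
from string import Template

# Defaults quoted in the PR description.
DEFAULT_UV_IMAGE = "ghcr.io/astral-sh/uv:0.5.1"
DEFAULT_MICROMAMBA_IMAGE = "mambaorg/micromamba:2.0.3-debian12-slim"

# Reduced stand-in for the Dockerfile template; only the two FROM lines are shown.
DOCKER_FILE_TEMPLATE = Template(
    "FROM $UV_IMAGE as uv\n"
    "FROM $MICROMAMBA_IMAGE as micromamba\n"
)


def render_from_lines(uv_image=None, micromamba_image=None):
    """Render the FROM lines, falling back to the defaults when a value is missing."""
    return DOCKER_FILE_TEMPLATE.substitute(
        UV_IMAGE=uv_image or DEFAULT_UV_IMAGE,
        MICROMAMBA_IMAGE=micromamba_image or DEFAULT_MICROMAMBA_IMAGE,
    )


def shell_default_is_rejected():
    """Return True if string.Template refuses the shell-style ${VAR:-default} form."""
    try:
        Template("FROM ${UV_IMAGE:-ghcr.io/astral-sh/uv:0.5.1} as uv").substitute(UV_IMAGE="x")
    except ValueError:
        return True
    return False
```

With this shape the previously hardcoded images remain the behavior when nothing is configured, and the template itself stays free of default-handling syntax.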

Code Review Run #4675f8


@@ -651,6 +670,8 @@ class DataConfig(object):
gcs: GCSConfig = GCSConfig()
azure: AzureBlobStorageConfig = AzureBlobStorageConfig()
generic: GenericPersistenceConfig = GenericPersistenceConfig()
image_builder: ImageBuilderConfig = ImageBuilderConfig()
Contributor

Function call in dataclass default

The default value for image_builder in DataConfig is a function call to ImageBuilderConfig(). Function calls in dataclass defaults can lead to unexpected behavior as they are evaluated only once at definition time.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- image_builder: ImageBuilderConfig = ImageBuilderConfig()
+ image_builder: ImageBuilderConfig = field(default_factory=ImageBuilderConfig)
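To illustrate what "evaluated only once at definition time" means in practice, here is a small sketch. The class names are stand-ins, not the flytekit classes. With a call as the default, every instance shares one object; with `default_factory`, each instance gets its own. Because the config class is a frozen dataclass, the shared object cannot be changed through normal attribute assignment, so the practical difference is mostly one of style.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BuilderConfigSketch:  # stand-in for ImageBuilderConfig
    uv_image: str = "ghcr.io/astral-sh/uv:0.5.1"
    micromamba_image: str = "mambaorg/micromamba:2.0.3-debian12-slim"


@dataclass(frozen=True)
class SharedDefault:
    # The call runs once, when the class body is executed.
    image_builder: BuilderConfigSketch = BuilderConfigSketch()


@dataclass(frozen=True)
class FactoryDefault:
    # The factory runs each time an instance is created without an explicit value.
    image_builder: BuilderConfigSketch = field(default_factory=BuilderConfigSketch)


a, b = SharedDefault(), SharedDefault()
c, d = FactoryDefault(), FactoryDefault()

shared_is_same_object = a.image_builder is b.image_builder
factory_is_same_object = c.image_builder is d.image_builder
```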

Code Review Run #4675f8


Signed-off-by: Nelson Chen <[email protected]>
@dataclass(init=True, repr=True, eq=True, frozen=True)
class ImageBuilderConfig(object):
"""
Any GCS specific configuration.
Member

Suggested change
- Any GCS specific configuration.
+ Any image builder configuration.

micromamba_image: str = "mambaorg/micromamba:2.0.3-debian12-slim"

@classmethod
def auto(cls, config_file: typing.Union[str, ConfigFile] = None) -> GCSConfig:
Member

Suggested change
- def auto(cls, config_file: typing.Union[str, ConfigFile] = None) -> GCSConfig:
+ def auto(cls, config_file: typing.Union[str, ConfigFile] = None) -> ImageBuilderConfig:

@@ -651,6 +670,8 @@ class DataConfig(object):
gcs: GCSConfig = GCSConfig()
azure: AzureBlobStorageConfig = AzureBlobStorageConfig()
generic: GenericPersistenceConfig = GenericPersistenceConfig()
image_builder: ImageBuilderConfig = ImageBuilderConfig()
Member

Why do we put it in the DataConfig?

Contributor Author

I think I can make it independent.

arbaobao added 3 commits March 7, 2025 15:31
Signed-off-by: Nelson Chen <[email protected]>
Signed-off-by: Nelson Chen <[email protected]>

codecov bot commented Mar 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.22%. Comparing base (2e12f43) to head (a84654b).
Report is 13 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3180      +/-   ##
==========================================
- Coverage   83.58%   77.22%   -6.37%     
==========================================
  Files           3      213     +210     
  Lines         195    22275   +22080     
  Branches        0     2901    +2901     
==========================================
+ Hits          163    17201   +17038     
- Misses         32     4227    +4195     
- Partials        0      847     +847     


@flyte-bot
Contributor

flyte-bot commented Mar 7, 2025

Code Review Agent Run #df86ab

Actionable Suggestions - 1
  • flytekit/configuration/__init__.py - 1
Review Details
  • Files reviewed - 3 · Commit Range: e10a65c..a84654b
    • flytekit/configuration/__init__.py
    • flytekit/configuration/internal.py
    • flytekit/image_spec/default_builder.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito

Any image builder specific configuration.
"""

default: DefaultConfig = DefaultConfig()
Contributor

Function call in dataclass default

The dataclass field default is using a function call DefaultConfig() as a default value, which can lead to unexpected behavior. Consider using field(default_factory=DefaultConfig) instead.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- default: DefaultConfig = DefaultConfig()
+ default: DefaultConfig = field(default_factory=DefaultConfig)

Code Review Run #df86ab


Member

@thomasjpfan thomasjpfan left a comment

Thank you for the PR @arbaobao !

I prefer to have these configuration options in ImageSpec itself (or a way for ImageSpec to pass in builder-specific configuration, like a task_config but for image builders).

Design-wise, when two people run the same image spec, it should produce the same image. If we depend on an external config file for building, it becomes harder to debug and reason about.


The options I see are:

  • builder_config
ImageSpec(builder_config={"uv_image": "..."})
  • Add another kwarg
ImageSpec(uv_image="...")
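A rough sketch of how the two options could resolve the uv image from the spec alone. `ImageSpecSketch` and `resolved_uv_image` are hypothetical stand-ins used only for illustration; they are not flytekit's ImageSpec, and the precedence between the two options is an arbitrary choice made here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Default quoted in the PR description.
DEFAULT_UV_IMAGE = "ghcr.io/astral-sh/uv:0.5.1"


@dataclass
class ImageSpecSketch:  # stand-in, NOT flytekit's ImageSpec
    name: str
    # Option 1: a free-form mapping of builder-specific settings.
    builder_config: Dict[str, str] = field(default_factory=dict)
    # Option 2: a dedicated keyword argument.
    uv_image: Optional[str] = None

    def resolved_uv_image(self) -> str:
        """Pick the uv image from the spec itself, never from an external file."""
        return self.uv_image or self.builder_config.get("uv_image") or DEFAULT_UV_IMAGE


# Both specs carry everything needed to reproduce the build.
via_mapping = ImageSpecSketch(
    name="demo", builder_config={"uv_image": "registry.example.com/uv:0.5.1"}
)
via_kwarg = ImageSpecSketch(name="demo", uv_image="registry.example.com/uv:0.5.1")
```

Either way, two people holding the same spec would resolve the same builder images, which is the property being argued for above.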

@pingsutw
Member

pingsutw commented Mar 8, 2025

@thomasjpfan I see your point, but if users want to use the UV image from a private registry, they must update every image spec

@pingsutw
Member

pingsutw commented Mar 8, 2025

Or we could show the builder config in the error message. Would that be helpful for debugging?
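As a rough illustration of that idea, a build wrapper could attach the effective builder config to the error it raises. `ImageBuildError` and `build_with_context` are hypothetical names for this sketch, not existing flytekit APIs.

```python
from typing import Callable, Mapping


class ImageBuildError(RuntimeError):  # hypothetical error type
    """Raised when an image build fails; the message carries the builder config."""


def build_with_context(build: Callable[[], None], builder_config: Mapping[str, str]) -> None:
    """Run a build callable and, on failure, report which builder images were used."""
    try:
        build()
    except Exception as exc:
        # Sort the keys so the message is stable across runs.
        details = ", ".join(f"{key}={value}" for key, value in sorted(builder_config.items()))
        raise ImageBuildError(f"image build failed with builder config: {details}") from exc
```

The original exception is kept as the cause, so the underlying failure stays visible next to the config that produced it.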

@thomasjpfan
Member

> I see your point, but if users want to use the UV image from a private registry, they must update every image spec

I think it is reasonable for users to change all their ImageSpecs if their uv image is in a private registry. I consider the uv image to be on the same level as ImageSpec.base_image.

@arbaobao
Contributor Author

@pingsutw @thomasjpfan
Based on the discussion above, I will implement this config in image_spec, but users will have to add it to every image_spec(xxx) call.

Is there anything else we still need to discuss?

4 participants