URL(s) with the issue:
https://www.tensorflow.org/tfx/api_docs/python/tfx/v1/dsl/Artifact
Description of issue (what needs changing):
Clear description:
Currently, the TYPE_NAME of a custom artifact must match the schema title regex (^[a-z][a-z0-9-_]{2,20}[.][A-Z][a-zA-Z0-9-_]{2,49}$); otherwise the Kubeflow V2 compiler raises a ValueError.
tfx/tfx/orchestration/kubeflow/v2/compiler_utils.py
Lines 302 to 335 in 2fd4f95
def get_artifact_schema(artifact_type: Type[artifact.Artifact]) -> str:
  """Gets the YAML schema string associated with the artifact type.

  Args:
    artifact_type: the artifact type that the schema is generated for.

  Returns:
    the encoded yaml schema definition for the artifact.

  Raises:
    ValueError if custom artifact type name does not adhere to KFP schema title.
  """
  if artifact_type in _SUPPORTED_STANDARD_ARTIFACT_TYPES:
    # For supported first-party artifact types, get the built-in schema yaml per
    # its type name.
    schema_path = os.path.join(
        os.path.dirname(__file__), 'artifact_types',
        '{}.yaml'.format(artifact_type.TYPE_NAME))
    return fileio.open(schema_path, 'rb').read()
  else:
    # Otherwise, fall back to the generic `Artifact` type schema.
    # To recover the Python type object at runtime, the artifact TYPE_NAME will
    # be encoded as the schema title.
    # Read the generic artifact schema template.
    if not _SCHEMA_TITLE_RE.fullmatch(artifact_type.TYPE_NAME):
      raise ValueError(
          f'Invalid custom artifact type name: {artifact_type.TYPE_NAME}')
    schema_path = os.path.join(
        os.path.dirname(__file__), 'artifact_types', 'Artifact.yaml')
    data = yaml.safe_load(fileio.open(schema_path, 'rb').read())
    # Encode artifact TYPE_NAME.
    data['title'] = artifact_type.TYPE_NAME
    return yaml.dump(data, sort_keys=False)
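To make the naming constraint concrete, here is a minimal sketch that checks candidate type names against the regex quoted in this issue. The pattern is copied from the issue text, not imported from TFX, and `is_valid_custom_type_name` plus the example names are hypothetical.

```python
import re

# Pattern copied from the issue text; assumed to mirror TFX's _SCHEMA_TITLE_RE.
SCHEMA_TITLE_RE = re.compile(
    r'^[a-z][a-z0-9-_]{2,20}[.][A-Z][a-zA-Z0-9-_]{2,49}$')


def is_valid_custom_type_name(type_name: str) -> bool:
    """Returns True if type_name has the form `<module>.<ClassName>`."""
    return SCHEMA_TITLE_RE.fullmatch(type_name) is not None


# Hypothetical names, for illustration only.
for name in ('my_artifacts.MyExamples',   # one lowercase module + class name
             'my_pkg.sub.MyExamples',     # nested module path: two dots
             'MyExamples'):               # no module prefix
    print(name, is_valid_custom_type_name(name))
```

Only the first name passes: the pattern allows exactly one dot, so a nested module path or a bare class name is rejected.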
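In addition, the custom artifact class must be importable using its schema title as the class path (that is, it must live in a top-level module), because the Kubeflow V2 container entry point resolves custom types by that title.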
tfx/tfx/orchestration/kubeflow/v2/container/kubeflow_v2_entrypoint_utils.py
Lines 219 to 234 in 2fd4f95
def _retrieve_class_path(type_schema: pipeline_pb2.ArtifactTypeSchema) -> str:
  """Gets the class path from an artifact type schema."""
  if type_schema.WhichOneof('kind') == 'schema_title':
    title = type_schema.schema_title
  if type_schema.WhichOneof('kind') == 'instance_schema':
    data = yaml.safe_load(type_schema.instance_schema)
    title = data.get('title', 'tfx.Artifact')
  if title in compiler_utils.TITLE_TO_CLASS_PATH:
    # For first party types, the actual import path is maintained in
    # TITLE_TO_CLASS_PATH map.
    return compiler_utils.TITLE_TO_CLASS_PATH[title]
  else:
    # For custom types, the import path is encoded as the schema title.
    return title
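Since the title of a custom type is returned as the import path, the class has to be importable under exactly that dotted name inside the container. The sketch below shows the general idea with the standard library only. `resolve_class_path` is a hypothetical helper and not the function TFX uses, and `collections.OrderedDict` merely stands in for a custom artifact class because its name happens to fit the `<module>.<ClassName>` shape.

```python
import importlib


def resolve_class_path(class_path: str) -> type:
    """Imports `<module>.<ClassName>` and returns the class object.

    Hypothetical helper: it illustrates how a dotted path can be resolved,
    and is an assumption about what the entry point needs to succeed.
    """
    module_path, _, class_name = class_path.rpartition('.')
    module = importlib.import_module(module_path)  # raises ImportError if absent
    return getattr(module, class_name)             # raises AttributeError if absent


# Stand-in for a custom artifact title; a real one would name your own module.
cls = resolve_class_path('collections.OrderedDict')
print(cls)
```

If the module named in the title is not installed in the container image, the import step fails, which is why this requirement is worth stating in the Artifact documentation.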
However, neither requirement is documented in the Artifact class. Documenting them would help developers who want to extend TFX with Vertex AI.