Description
As part of the future experiment tracking work, we want the application to know its own identity. When we launch a job we return the full job ID (i.e. `kubernetes://session/app_id`), but the app itself doesn't have access to this same job ID. We do provide an `app_id` macro that can be used in the app def for both env and arguments, but it's up to the app owner to add it manually.
Motivation/Background
If we add a `TORCHX_JOB_ID` environment variable, we can write more standardized integrations for experiment tracking that use the job ID as a key. There's no added cost from an extra environment variable, and it will enable deeper automatic integrations with other libraries.
Detailed Proposal
Add a new environment variable in `Runner.dryrun` (https://github.com/pytorch/torchx/blob/main/torchx/runner/api.py#L241) that uses `macros.app_id` to build the full job ID from the scheduler and session information available on the runner.
https://github.com/pytorch/torchx/blob/main/torchx/specs/api.py#L156
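A rough sketch of how this could work, written as standalone Python rather than against the real TorchX classes. The `Role` dataclass, `inject_job_id`, and `resolve_app_id` are hypothetical stand-ins for illustration only, and `APP_ID_MACRO` is assumed to mirror the value of `macros.app_id`. The idea is that the runner writes a job ID template into each role's environment at dryrun time, and the existing macro substitution fills in the app ID once the scheduler knows it.

```python
from dataclasses import dataclass, field
from string import Template
from typing import Dict, List

# Assumption: mirrors the placeholder that macros.app_id expands from.
APP_ID_MACRO = "${app_id}"
JOB_ID_ENV = "TORCHX_JOB_ID"


@dataclass
class Role:
    """Hypothetical stand-in for a role in the app def (name and env only)."""
    name: str
    env: Dict[str, str] = field(default_factory=dict)


def inject_job_id(roles: List[Role], scheduler: str, session_name: str) -> None:
    """Add a job ID template to every role's env (the proposed dryrun step)."""
    template = f"{scheduler}://{session_name}/{APP_ID_MACRO}"
    for role in roles:
        # setdefault keeps any value the app owner already set by hand.
        role.env.setdefault(JOB_ID_ENV, template)


def resolve_app_id(roles: List[Role], app_id: str) -> None:
    """Substitute the app_id macro in env values (stand-in for macro expansion)."""
    for role in roles:
        role.env = {
            key: Template(value).safe_substitute(app_id=app_id)
            for key, value in role.env.items()
        }


roles = [Role(name="trainer")]
inject_job_id(roles, scheduler="kubernetes", session_name="session")
resolve_app_id(roles, app_id="my-app-123")
print(roles[0].env[JOB_ID_ENV])
```

Using `setdefault` here is a design choice in the sketch, not part of the proposal: it avoids overwriting a `TORCHX_JOB_ID` that an app owner has already defined. Inside the app, an integration could then read the value with `os.environ.get("TORCHX_JOB_ID")` and use it as the tracking key.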