Add serve deploy and quickstart guides. #754
# Lepton.AI deployment

## Using the Lepton.AI Dashboard

We will use the Lepton.AI dashboard to start the inference service.
Please refer to your onboarding instructions to get access to this dashboard.

The dashboard has an `Endpoints` tab at the top.
This is used to deploy long-running services such as inference.

* Click on the `Endpoints` tab, then click on the `Create Endpoint` button on the right-hand side.
* Choose the `Create from Container Image` option.
* Set an appropriate endpoint name.
* Resource:
  * Choose the GPU option. Currently only a single GPU (x1) is supported, but this will change in the future.
  * Choose any preemption policy.
* Image Configuration:
  * Set your custom Docker image, or use one of the prebuilt tags as appropriate.
  * Set the server port to 8000 for the inference container, and 8888 for the Jupyter container.
  * A registry auth might need to be created to access a private registry. If so, supply it here.
  * For the custom command, refer to the [Custom Command](#custom-command) section.
* Access Tokens:
  * If required, create a new access token for authorization.
  * If one is created, it must be supplied when calling the REST APIs using the header
    `-H "Authorization: Bearer ${TOKEN}"`.
* Environment variables and secrets can be provided if necessary (e.g. `WANDB_API_KEY`).
* Storage:
  * The inference container expects a mount for `/outputs`. Set this in the `Mount Path`.
  * During onboarding, your project is provided with some NFS storage at a certain path.
    You can provide a sub-directory within this path in the `From path`.
  * The volume should be `lepton-shared-fs` or `amlfs`.
* Click `Create` to create this endpoint. Choose 1 replica.

Once the endpoint scales up and is ready, you can start sending REST API requests to it.
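As a minimal sketch, a request to the deployed endpoint with the access token might look like the following. The endpoint URL is a placeholder; substitute the one shown on your Lepton.AI dashboard.

```shell
# Placeholder URL -- copy the real endpoint URL from the dashboard.
ENDPOINT="https://your-endpoint.example.com"

# Supply the access token created above, if one was configured.
curl -H "Authorization: Bearer ${TOKEN}" "${ENDPOINT}/health"
```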

### Custom Command

The Docker image as built from the default Dockerfile comes preset with the command to run the
service.
If the default settings in `serve/server/conf/config.yaml` are fine, then you can leave this
section blank.
If you wish to override certain settings with environment variables, or have some custom setup of
your own, then provide it here.

```bash
#!/bin/bash
<Additional custom setup if needed>
```
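For illustration, a custom command might export a few overrides before the preset server command runs. The variable names below are hypothetical, not documented settings; check `serve/server/conf/config.yaml` for the options your deployment actually reads.

```shell
#!/bin/bash
# Hypothetical example -- the variable names are illustrative only.
export LOG_LEVEL="info"                     # example config override via env var
export WANDB_API_KEY="${WANDB_API_KEY:-}"   # forward a secret set in the dashboard
echo "custom setup complete"
```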

## Debugging and logs

Click on the endpoint, then on `Replicas`, to bring up some additional options.

* Clicking on `API` brings up an option to run the various REST APIs,
  e.g. health checks or listing inference requests.
* Clicking on `Terminal` for a specific replica opens a terminal into the container.
* Clicking on `Logs` shows a live stream of the current logs (slightly delayed).
# Quickstart guide

## Developer quickstart

Developers who have Earth2Studio installed on a GPU-enabled system can easily get started with the
inference platform as follows.
For developers who prefer to test using a container with the requirements pre-installed,
please refer to the [Container Builds](#container-builds) section below.

* Install redis:

  ```bash
  apt update && apt install redis
  ```
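On systems where the package install does not start the server automatically (e.g. inside a container), you may need to start redis yourself; a quick sketch:

```shell
# Start redis in the background if it is not already running.
redis-server --daemonize yes

# Verify it is reachable; a healthy server replies with PONG.
redis-cli ping
```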

* Install the requirements for the inference server:

  ```bash
  cd server
  pip install -r requirements.txt
  ```

* The default Dockerfile CMD starts up the inference server.

* Check health:

  ```bash
  curl localhost:8000/health
  ```

### Creating and testing a custom workflow locally

* Use the `Earth2Workflow` base class to develop inference workflows.
  An example is shown in `server/example_workflows/deterministic_earth2_workflow.py`.

An example of a locally tested custom workflow is shown below.

```python
"""
Deterministic Workflow Custom Pipeline

This pipeline implements the deterministic workflow from examples/01_deterministic_workflow.py
as a custom pipeline that can be invoked via the REST API.
"""

from datetime import datetime
from typing import Literal

from earth2studio import run
from earth2studio.data import GFS
from earth2studio.io import IOBackend, ZarrBackend
from earth2studio.models.px import DLWP, FCN
from earth2studio.serve.server import Earth2Workflow, workflow_registry


@workflow_registry.register
class DeterministicEarth2Workflow(Earth2Workflow):
    """Deterministic workflow with auto-registration"""

    name = "deterministic_earth2_workflow"
    description = "Deterministic workflow with auto-registration"

    def __init__(self, model_type: Literal["fcn", "dlwp"] = "fcn"):
        super().__init__()

        if model_type == "fcn":
            package = FCN.load_default_package()
            self.model = FCN.load_model(package)
        elif model_type == "dlwp":
            package = DLWP.load_default_package()
            self.model = DLWP.load_model(package)
        else:
            raise ValueError(f"Unsupported model type: {model_type}")

        self.data = GFS()

    def __call__(
        self,
        io: IOBackend,
        start_time: list[datetime] = [datetime(2024, 1, 1, 0)],
        num_steps: int = 20,
    ):
        """Run the deterministic workflow pipeline"""
        run.deterministic(start_time, num_steps, self.model, self.data, io)


print("initializing model")
model = DeterministicEarth2Workflow()

print("calling model")
io = ZarrBackend()
model(io)
```

It can be run as follows, without needing to start redis:

```bash
python serve/server/example_workflows/custom_workflow.py
```

* Refer to these READMEs for more details: [Earth2Workflow](./server/README_earth2workflows.md),
  [Workflow](./server/README_workflows.md).
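The `@workflow_registry.register` decorator above adds the class to a name-to-workflow lookup that the server can use to dispatch requests. As a rough illustration of the mechanism only (not the actual `earth2studio.serve` implementation), such a registry can be as simple as:

```python
# Minimal sketch of a class registry; illustrative only, not the
# actual earth2studio.serve implementation.


class WorkflowRegistry:
    def __init__(self):
        self._workflows = {}

    def register(self, cls):
        # Index the class by its declared name so a server could look it
        # up when a request names this workflow.
        self._workflows[cls.name] = cls
        return cls  # return the class unchanged, so the decorator is transparent

    def get(self, name):
        return self._workflows[name]


workflow_registry = WorkflowRegistry()


@workflow_registry.register
class EchoWorkflow:
    """Toy workflow used to demonstrate registration."""

    name = "echo"

    def __call__(self, payload):
        return payload
```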

## Container builds

The Earth2Studio parent directory contains Dockerfiles that let you build the inference service
for deployment onto Lepton.AI.

### Inference Container

The inference container can be built from the [Dockerfile](./Dockerfile).

Alternatively, the prebuilt container images can be used from the
[NGC registry][ngc-registry] after onboarding.

<!-- markdownlint-disable-next-line MD013 -->
[ngc-registry]: https://registry.ngc.nvidia.com/orgs/dycvht5ows21/containers/earth2studio-scicomp/tags

## Lepton.AI onboarding

Please talk to your NVIDIA contact or TAM to get onboarded onto the Lepton.AI cluster.

## Lepton.AI deployment

Please see the [deployment guide](DEPLOY.md) for instructions on how to set up the inference
service on your Lepton.AI endpoint.

## Using the inference service

Once you have set up your inference endpoint, you may either call the services directly through
REST APIs or use the client SDK.
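A minimal sketch of calling the REST API directly with the Python standard library is shown below. The endpoint URL and token are placeholders; the `/health` route is the one documented above, and the `Authorization` header is only needed if an access token was configured for the endpoint.

```python
# Sketch of a direct REST call using only the standard library.
import urllib.request

ENDPOINT = "http://localhost:8000"  # placeholder; use your deployed endpoint URL
TOKEN = ""  # access token, if one was configured


def check_health(endpoint: str = ENDPOINT, token: str = TOKEN) -> int:
    """Call the service's /health route and return the HTTP status code."""
    request = urllib.request.Request(endpoint + "/health")
    if token:
        request.add_header("Authorization", "Bearer " + token)
    with urllib.request.urlopen(request) as response:
        return response.status
```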

## Writing custom inference workflows

You may port more of the [predefined examples](../examples), or write your own custom workflows
using the [custom workflows](server/README_workflows.md) guide.
**Review comment: Mutable default argument in documentation example**

The `start_time` parameter uses a mutable default argument (`[datetime(2024, 1, 1, 0)]`), which is
a well-known Python pitfall: the list is shared across all calls that don't provide this argument.
This same pattern exists in the source file
`server/example_workflows/deterministic_earth2_workflow.py`, so it's a pre-existing issue, but
since this README is meant to guide new users, it may be worth using an immutable default (e.g.,
`None` with a check inside the method body) to avoid teaching the anti-pattern.

Additionally, the `num_steps` default here is `20`, while the actual source file uses `6`. If the
intent is to mirror the source file, consider keeping these in sync to avoid confusion.