Stand up a Poolside deployment on AWS in one of two profiles: platform-only (no GPU) or full (platform + local GPU inference). Pick a profile, copy its example root out of this repo, fill in your variables, and apply.
For background on what gets created, see architecture.md. For the prerequisite list (tools, AWS credentials, bundle, model checkpoints), see prerequisites.md. For the staged-rollout walkthrough and verification commands, see deployment-guide.md.

| Component | `platform-only` | `full` |
|---|---|---|
| VPC, EKS cluster, CPU node group | ✅ | ✅ |
| RDS PostgreSQL, S3 buckets, ECR | ✅ | ✅ |
| ALB controller, External Secrets Operator | ✅ | ✅ |
| `poolside-deployment` Helm release | ✅ | ✅ |
| GPU node group + NVIDIA GPU Operator | ❌ | ✅ |
| Models S3 bucket + checkpoint upload | ❌ | ✅ |
| `inference-stack` Helm release | ❌ | ✅ |
| Cognito user pool | ☑️ optional | ☑️ optional |


| Profile | When to use |
|---|---|
| `platform-only` | You connect Poolside to an external OpenAI-compatible model API (Anthropic, OpenAI, a Bedrock shim, internally hosted inference). Approximately $3/hr. |
| `full` | You want everything in one cluster and have GPU quota in your AWS account. Approximately $100/hr at the recommended GPU defaults. |

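
To put the hourly figures in perspective, the sketch below multiplies them out with shell integer arithmetic. The rates are the approximate figures from the table above. `estimate_cost` is a hypothetical helper name, and actual charges will vary with region, instance choices, and usage.

```shell
# Rough running cost in whole dollars: hourly rate times hours.
# estimate_cost is an illustrative helper, not part of this repo.
estimate_cost() {
  echo $(( $1 * $2 ))
}

estimate_cost 3 24     # platform-only for one day  -> 72
estimate_cost 100 24   # full for one day           -> 2400
estimate_cost 100 8    # full for an 8-hour test    -> 800
```
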
Both profiles follow the same six steps. The full profile has additional inputs for the GPU node group and model checkpoint uploads.
- Confirm prerequisites. You need AWS credentials with admin-equivalent access on the target account, the required CLI tools, an extracted Poolside Helm bundle, model checkpoint tarballs (`full` profile only), a public DNS hostname you've chosen, and an ACM certificate covering that hostname, issued in your target region. See prerequisites.md for the full list and a sanity checklist.
- Copy the example root outside this repo:

  ```shell
  # Full profile
  cp -r path/to/aws-reference-architecture/public/examples/full ~/my-poolside-deployment

  # Or platform-only
  cp -r path/to/aws-reference-architecture/public/examples/platform-only ~/my-poolside-deployment

  cd ~/my-poolside-deployment
  ```

- Fill in `terraform.tfvars` from the supplied example. Required for both profiles:

  | Variable | Notes |
  |---|---|
  | `deployment_name` | Lowercase prefix used for every AWS resource |
  | `region` | AWS region |
  | `cluster_endpoint_public_access_cidrs` | CIDR blocks allowed to reach the EKS public API endpoint |
  | `admin_principal_arns` | IAM principals granted EKS cluster-admin via Access Entries |
  | `public_hostname` | Public DNS hostname; an ACM cert covering it must already exist in `region` |
  | `containers_dir` | Path to the bundle's `containers/` directory |
  | `bundle_root` | Path to the extracted bundle (parent of `charts/` and `containers/`) |

  Additionally required for the `full` profile:

  | Variable | Notes |
  |---|---|
  | `checkpoints_dir` | Directory of `*.tar` model checkpoint archives |

  Optional for both profiles: `enable_cognito = true` to have Terraform create a Cognito user pool instead of bringing your own OIDC.

- Export your AWS profile in the shell:

  ```shell
  export AWS_PROFILE=<your-admin-profile>
  aws sts get-caller-identity
  ```

- Init, plan, apply. The first apply provisions infrastructure, pushes container images to ECR, and (`full` profile) uploads model checkpoints. The Helm releases are gated off:

  ```shell
  terraform init
  terraform plan
  terraform apply
  ```

- Flip on the Helm installs once you've verified each layer. Set `install_poolside_deployment = true` (and `install_inference_stack = true` for the `full` profile) in `terraform.tfvars`, then re-apply. See deployment-guide.md for the full staged walkthrough and verification commands.

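
Before the first `terraform plan`, it can help to confirm that every required variable has an assignment in `terraform.tfvars`. The following is a minimal sketch of such a check. `missing_tfvars` is a hypothetical helper (not part of this repo), the variable names are those listed in the steps above, and the check is purely textual: it does not validate the values.

```shell
# List required variables that have no assignment in a tfvars file.
# Usage: missing_tfvars <tfvars-file> <platform-only|full>
# missing_tfvars is an illustrative helper, not part of this repo.
missing_tfvars() {
  mt_file="$1"
  mt_required="deployment_name region cluster_endpoint_public_access_cidrs admin_principal_arns public_hostname containers_dir bundle_root"
  if [ "$2" = "full" ]; then
    mt_required="$mt_required checkpoints_dir"
  fi
  for mt_name in $mt_required; do
    # Count a variable as set when a line begins with "<name> =";
    # commented-out lines (starting with '#') do not match.
    if ! grep -Eq "^[[:space:]]*${mt_name}[[:space:]]*=" "$mt_file"; then
      echo "$mt_name"
    fi
  done
}
```

Running `missing_tfvars terraform.tfvars full` prints one missing variable name per line, and prints nothing when every required variable is assigned.
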
- Get the ALB hostname from the ingress resource:

  ```shell
  kubectl get ingress -n poolside
  ```

- Point your `public_hostname` at the ALB in your DNS provider. Most providers support a `CNAME` from your hostname to the ALB DNS name. If you host the zone in Route 53, use an alias `A` record (type = A, alias target = the ALB) instead; Route 53 aliases also work at the zone apex, where CNAMEs aren't valid.
- Bind your identity provider by visiting `https://<your-hostname>` and following the on-screen prompts. With Cognito, retrieve the issuer URL, client ID, and client secret from the Terraform outputs:

  ```shell
  terraform output -raw cognito_user_pool_endpoint
  terraform output -raw cognito_user_pool_client_id
  terraform output -raw cognito_user_pool_client_secret
  ```

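
Once the DNS record is in place, a quick check that the hostname's `CNAME` points at the ALB might look like the sketch below. It assumes a single ingress in the `poolside` namespace and that `dig` is installed. The function names are hypothetical and not part of this repo.

```shell
# Normalize a DNS name: lowercase it and drop one trailing dot,
# so "Example.COM." and "example.com" compare equal.
normalize_dns_name() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# Succeed when two DNS names are equal after normalization.
same_dns_name() {
  [ "$(normalize_dns_name "$1")" = "$(normalize_dns_name "$2")" ]
}

# Compare the hostname's CNAME target with the ALB name on the ingress.
# Assumes one ingress in the "poolside" namespace and dig on PATH.
check_cname() {
  cc_alb="$(kubectl get ingress -n poolside \
    -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}')"
  if [ -z "$cc_alb" ]; then
    echo "No ALB hostname on the ingress yet" >&2
    return 1
  fi
  cc_target="$(dig +short CNAME "$1")"
  if same_dns_name "$cc_alb" "$cc_target"; then
    echo "OK: $1 -> $cc_alb"
  else
    echo "MISMATCH: CNAME is '$cc_target', ALB is '$cc_alb'" >&2
    return 1
  fi
}
```

Call it as `check_cname <your-hostname>`. A Route 53 alias `A` record has no `CNAME` to inspect, so this check only applies to the `CNAME` setup.
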
- The minimum supported GPU instance type is `p5e.48xlarge`. Smaller GPUs do not have enough memory for the shipped models.
- Python 3 with `boto3` importable is required for model checkpoint uploads in the `full` profile. If that's a non-starter, see the BYO-bucket mode in model-checkpoints.md.
- External OIDC values are entered in the Poolside Console during first-time setup. Cognito is optional; when enabled, Terraform outputs the values needed to bind.
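
The `boto3` requirement can be checked up front, before the `full` profile's first apply. The sketch below assumes `python3` is the interpreter used for the upload. `python_can_import` is a hypothetical helper name, not part of this repo.

```shell
# Succeed when python3 can import the named module.
# python_can_import is an illustrative helper, not part of this repo.
python_can_import() {
  python3 -c "import $1" >/dev/null 2>&1
}

if python_can_import boto3; then
  echo "boto3 is importable"
else
  echo "boto3 is not importable; install it or use the BYO-bucket mode" >&2
fi
```

If the check fails, `python3 -m pip install boto3` is one common way to install the package.
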