runpod · muhsinking · Aug 19, 2025 · Aug 20, 2025 · Aug 21, 2025 · Aug 21, 2025
diff --git a/docs.json b/docs.json
@@ -152,6 +152,7 @@
                 "pages": [
                   "pods/templates/overview",
                   "pods/templates/manage-templates",
+                  "pods/templates/create-custom-template",
                   "pods/templates/secrets"
                 ]
               },
@@ -206,7 +207,6 @@
               {
                 "group": "Troubleshooting",
                 "pages": [
-                  "references/troubleshooting/cuda-version-issue",
                   "references/troubleshooting/leaked-api-keys",
                   "references/troubleshooting/storage-full",
                   "references/troubleshooting/troubleshooting-502-errors",

diff --git a/pods/templates/create-custom-template.mdx b/pods/templates/create-custom-template.mdx
@@ -0,0 +1,359 @@
+---
+title: "Create a custom Pod template"
+sidebarTitle: "Create a custom template"
+description: "Learn how to extend official Runpod templates to create your own Pod templates."
+---
+
+This tutorial shows how to create custom Pod templates by extending official Runpod base images with additional Python dependencies and pre-baked ML models. You'll learn the complete workflow from Dockerfile creation to deployment and testing.
+
+Custom templates allow you to package your specific dependencies, models, and configurations into reusable Docker images that can be deployed as Pods. This approach saves time during Pod initialization and ensures consistent environments across deployments.
+
+## What you'll learn
+
+In this tutorial, you'll learn how to:
+
+- Create a Dockerfile that extends a Runpod base image.
+- Add Python dependencies to an existing base image.
+- Pre-package ML models into your custom template.
+- Build and push Docker images with the correct platform settings.
+- Deploy and test your custom template as a Pod.
+
+## Requirements
+
+Before you begin, you'll need:
+
+- A [Runpod account](/get-started/manage-accounts).
+- [Docker](https://www.docker.com/products/docker-desktop/) installed on your local machine.
+- A [Docker Hub](https://hub.docker.com/) account for hosting your custom images.
+- At least $5 in Runpod credits for testing.
+- Basic familiarity with Docker and command-line operations.
+
+## Step 1: Create a custom Dockerfile
+
+First, you'll create a Dockerfile that extends a Runpod base image with additional dependencies:
+
+1. Create a new directory for your custom template:
+
+```bash
+mkdir my-custom-template
+cd my-custom-template
+```
+
+2. Create a new Dockerfile:
+
+```bash
+touch Dockerfile
+```
+
+3. Open the Dockerfile in your preferred text editor and add the following content:
+
+```dockerfile
+# Use the specified base image
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# Install additional Python dependencies
+RUN pip install --no-cache-dir \
+    transformers \
+    accelerate
+
+# Don't specify a CMD here to use the base image's CMD
+```
+
+This extends the official Runpod PyTorch 2.8.0 base image and installs two additional Python packages. Because the packages are installed when the image is built, they're already available every time a Pod starts from it, so you won't need to run `pip install` again after Pod restarts.
+
+<Note>
+When building custom templates, always start with a Runpod base image that matches your CUDA requirements. The base image includes essential components like the `/start.sh` script that handles Pod initialization.
+</Note>
+
+To maintain access to packaged services (like JupyterLab and SSH over TCP), don't specify a `CMD` or `ENTRYPOINT` in the Dockerfile. Runpod base images include a carefully configured startup script (`/start.sh`) that handles Pod initialization, SSH setup, and service startup. Overriding it can break Pod functionality.
+
+## Step 2: Add system dependencies
+
+If your application requires system-level packages, add them before the Python dependencies:
+
+```dockerfile
+# Use the specified base image
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# Update package list and install system dependencies
+RUN apt-get update && apt-get install -y \
+    git \
+    wget \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install additional Python dependencies
+# (torchaudio is pinned to match the base image's PyTorch 2.8.0)
+RUN pip install --no-cache-dir \
+    transformers \
+    accelerate \
+    datasets \
+    torchaudio==2.8.0
+
+# Don't specify a CMD here to use the base image's CMD
+```
+
+<Tip>
+Always clean up package lists with `rm -rf /var/lib/apt/lists/*` after installing system packages to reduce image size.
+</Tip>
+
+## Step 3: Pre-bake ML models
+
+To reduce Pod setup overhead, you can pre-download models while the image is being built, so they're already cached when the Pod starts. Here are two approaches:
+
+### Method 1: Simple model download script
+
+Create a Python script that downloads your model:
+
+1. Create a file named `download_model.py` in the same directory as your Dockerfile:
+
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+
+# Download and cache the model
+model_name = "microsoft/DialoGPT-medium"
+print(f"Downloading {model_name}...")
+
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+
+print("Model downloaded and cached successfully!")
+```
+
+2. Update your Dockerfile to include and run this script:
+
+```dockerfile
+# Use the specified base image
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# Install additional Python dependencies
+RUN pip install --no-cache-dir \
+    transformers \
+    accelerate
+
+# Copy and run model download script
+COPY download_model.py /tmp/download_model.py
+RUN python /tmp/download_model.py && rm /tmp/download_model.py
+
+# Don't specify a CMD here to use the base image's CMD
+```
+
+### Method 2: Using the `huggingface_hub` library
+
+For more control over model downloads, call `snapshot_download` from the `huggingface_hub` library:
+
+```dockerfile
+# Use the specified base image
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# Install additional Python dependencies
+RUN pip install --no-cache-dir \
+    transformers \
+    accelerate \
+    huggingface_hub
+
+# Pre-download the model into the default Hugging Face cache, where `from_pretrained` looks for it
+RUN python -c "from huggingface_hub import snapshot_download; snapshot_download('microsoft/DialoGPT-medium')"
+
+# Don't specify a CMD here to use the base image's CMD
+```
+
+<Warning>
+Pre-baking large models will significantly increase your Docker image size and build time. Consider whether the faster Pod startup time justifies the larger image size for your use case.
+</Warning>
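+
+As a hedged sketch of where pre-baked files should live: a volume mounted at `/workspace` can hide files that the image stored under that path, so the model cache needs to sit elsewhere in the image. The fragment below assumes a hypothetical cache directory, `/opt/hf-cache`, and uses the `HF_HOME` environment variable so the build step and the running Pod resolve the same location. Whether every process in the Pod inherits this variable depends on the base image's startup script, so treat that as an assumption to verify.
+
+```dockerfile
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+RUN pip install --no-cache-dir transformers accelerate huggingface_hub
+
+# Hypothetical cache location, chosen to sit outside the /workspace volume mount
+ENV HF_HOME=/opt/hf-cache
+
+# Download into $HF_HOME at build time so the files ship with the image
+RUN python -c "from huggingface_hub import snapshot_download; snapshot_download('microsoft/DialoGPT-medium')"
+```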
+
+## Step 4: Build and push your Docker image
+
+Now you're ready to build your custom image and push it to Docker Hub:
+
+1. Build your Docker image with the correct platform specification:
+
+```bash
+docker build --platform=linux/amd64 -t my-custom-template:latest .
+```
+
+<Note>
+The `--platform=linux/amd64` flag is crucial for Runpod compatibility, especially if you're building on an ARM machine (such as an Apple Silicon Mac). Runpod's infrastructure requires AMD64 architecture images.
+</Note>
+
+2. Tag your image for Docker Hub (replace `YOUR_USERNAME` with your Docker Hub username):
+
+```bash
+docker tag my-custom-template:latest YOUR_USERNAME/my-custom-template:latest
+```
+
+3. Push the image to Docker Hub:
+
+```bash
+docker push YOUR_USERNAME/my-custom-template:latest
+```
+
+<Tip>
+If you haven't logged into Docker Hub from your command line, run `docker login` first and enter your Docker Hub credentials.
+</Tip>
+
+## Step 5: Create a Pod template in Runpod
+
+Next, create a Pod template using your custom Docker image:
+
+1. Navigate to the [Templates page](https://console.runpod.io/user/templates) in the Runpod console.
+2. Click **New Template**.
+3. Configure your template with these settings:
+   - **Name**: Give your template a descriptive name (e.g., "My Custom PyTorch Template").
+   - **Container Image**: Enter your Docker Hub image name (e.g., `YOUR_USERNAME/my-custom-template:latest`).
+   - **Container Disk**: Set to at least 20 GB to accommodate your custom dependencies.
+   - **Volume Disk**: Set according to your storage needs (e.g., 20 GB).
+   - **Volume Mount Path**: Keep the default `/workspace`.
+   - **Expose HTTP Ports**: Add `8888` for JupyterLab access.
+   - **Expose TCP Ports**: Add `22` if you need SSH access.
+4. Click **Save Template**.
+
+## Step 6: Deploy and test your custom template
+
+Now you're ready to deploy a Pod using your custom template to verify everything works correctly:
+
+1. Go to the [Pods page](https://console.runpod.io/pods) in the Runpod console.
+2. Click **Deploy**.
+3. Choose an appropriate GPU (make sure it meets the CUDA version requirements of your base image).
+4. Click **Change Template** and select your custom template under **Your Pod Templates**.
+5. Fill out the rest of the settings as desired, then click **Deploy On Demand**.
+6. Wait for your Pod to initialize (this may take 5-10 minutes for the first deployment).
+
+## Step 7: Verify your custom template
+
+Once your Pod is running, verify that your customizations work correctly:
+
+1. Find your Pod on the [Pods page](https://console.runpod.io/pods) and click on it to open the connection menu. Click **Jupyter Lab** under **HTTP Services** to open JupyterLab.
+2. Create a new Python notebook and test your pre-installed dependencies:
+
+```python
+# Test that your custom packages are installed
+import transformers
+import accelerate
+print(f"Transformers version: {transformers.__version__}")
+print(f"Accelerate version: {accelerate.__version__}")
+
+# If you pre-baked a model, test loading it
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+model_name = "microsoft/DialoGPT-medium"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+
+print("Model loaded successfully!")
+```
+
+3. Run the cell to confirm everything is working as expected.
+
+## Advanced customization options
+
+### Setting environment variables
+
+You can set default environment variables in your template configuration:
+
+1. In the template creation form, scroll to **Environment Variables**.
+2. Add key-value pairs for any environment variables your application needs:
+   - Key: `HUGGINGFACE_HUB_CACHE`
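+   - Value: `/workspace/hf_cache`
+
+   Note that pointing the Hugging Face cache at `/workspace` means that models pre-baked into the image's default cache (under `/root/.cache/huggingface`) won't be found there and will be downloaded again.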
+
+### Adding startup scripts
+
+To run custom initialization code when your Pod starts, create a startup script:
+
+1. Create a `startup.sh` file in your project directory:
+
+```bash
+#!/bin/bash
+echo "Running custom startup script..."
+mkdir -p /workspace/models
+echo "Custom startup complete!"
+```
+
+2. Add it to your Dockerfile:
+
+```dockerfile
+# Copy startup script
+COPY startup.sh /usr/local/bin/startup.sh
+RUN chmod +x /usr/local/bin/startup.sh
+
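+# Append a call to the custom script to the base image's start script.
+# Caution: if /start.sh ends with a blocking command (such as `sleep infinity`),
+# a line appended after it never runs. Check /start.sh in your base image first.
+RUN echo '/usr/local/bin/startup.sh' >> /start.sh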
+```
+
+### Using multi-stage builds
+
+For complex applications, use multi-stage builds to reduce final image size:
+
+```dockerfile
+# Build stage
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04 AS builder
+
+# Install build dependencies
+RUN apt-get update && apt-get install -y build-essential
+RUN pip install --no-cache-dir some-package-that-needs-compilation
+
+# Final stage
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# Copy only the necessary files from builder
+COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
+
+# Install runtime dependencies
+RUN pip install --no-cache-dir transformers accelerate
+```
+
+## Troubleshooting
+
+Here are solutions to common issues when creating custom templates:
+
+### Build failures
+
+- **Platform mismatch**: Always use `--platform=linux/amd64` when building.
+- **Base image not found**: Verify the base image tag exists on Docker Hub.
+- **Package installation fails**: Check that package names are correct and available for the Python version in your base image.
+
+### Pod deployment issues
+
+- **Pod fails to start**: Check the Pod logs in the Runpod console for error messages.
+- **Services not accessible**: Ensure you've exposed the correct ports in your template configuration.
+- **CUDA version mismatch**: Make sure the CUDA version of your base image is supported by the machine hosting your chosen GPU.
+
+### Performance issues
+
+- **Slow startup**: Consider pre-baking more dependencies or using a smaller base image.
+- **Out of memory**: Choose a GPU with more VRAM. If the Pod is running out of disk space instead, increase the container disk size.
+- **Model loading errors**: Verify that pre-baked models are in the expected cache directories.
+
+## Best practices
+
+Follow these best practices when creating custom templates:
+
+### Image optimization
+
+- Use `.dockerignore` to exclude unnecessary files from your build context.
+- Combine RUN commands to reduce image layers.
+- Clean up package caches and temporary files.
+- Use specific version tags for dependencies to ensure reproducibility.
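+
+The points above can be combined in one Dockerfile. This is a sketch rather than a recommended configuration: the pinned versions (`transformers==4.44.2`, `accelerate==0.34.2`) are illustrative assumptions, so substitute the versions you have actually tested.
+
+```dockerfile
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# One RUN instruction: install system packages and clean up in the same layer
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+
+# Pin versions for reproducible builds (illustrative pins, not tested recommendations)
+RUN pip install --no-cache-dir \
+    transformers==4.44.2 \
+    accelerate==0.34.2
+```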
+
+### Security considerations
+
+- Don't include sensitive information like API keys in your Docker image.
+- Use [Runpod Secrets](/pods/templates/secrets) for sensitive configuration.
+- Regularly update base images to get security patches.
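+
+If a build step needs a credential, for example to download a gated model, a BuildKit secret mount keeps it out of the image layers. The sketch below assumes a secret ID of `hf_token` and a placeholder model ID, `YOUR_ORG/YOUR_GATED_MODEL`; both are hypothetical names for illustration.
+
+```dockerfile
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+RUN pip install --no-cache-dir huggingface_hub
+
+# The secret is mounted at /run/secrets/hf_token for this instruction only
+# and is not written to any image layer.
+RUN --mount=type=secret,id=hf_token \
+    HF_TOKEN="$(cat /run/secrets/hf_token)" \
+    python -c "from huggingface_hub import snapshot_download; snapshot_download('YOUR_ORG/YOUR_GATED_MODEL')"
+```
+
+To supply the secret at build time, pass something like `--secret id=hf_token,src=hf_token.txt` to `docker build`, where `hf_token.txt` is a local file that contains the token.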
+
+### Version management
+
+- Tag your images with version numbers (e.g., `v1.0.0`) instead of just `latest`.
+- Keep a changelog of what changes between versions.
+- Test new versions thoroughly before updating production templates.
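+
+One way to keep the image tag and its metadata in step is to pass the version in as a build argument. This is a sketch: `TEMPLATE_VERSION` is a hypothetical argument name, while `org.opencontainers.image.version` is a standard OCI image label.
+
+```dockerfile
+FROM runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
+
+# Hypothetical build argument; override it with --build-arg at build time
+ARG TEMPLATE_VERSION=v1.0.0
+
+# Record the version in the image metadata so it can be read with `docker inspect`
+LABEL org.opencontainers.image.version="${TEMPLATE_VERSION}"
+```
+
+For example, building with `--build-arg TEMPLATE_VERSION=v1.0.1 -t YOUR_USERNAME/my-custom-template:v1.0.1` produces an image whose tag and label agree.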
+
+## Next steps
+
+Now that you have a working custom template, consider these next steps:
+
+- **Automate builds**: Set up GitHub Actions or similar CI/CD to automatically build and push new versions of your template.
+- **Share with team**: If you're using a team account, share your template with team members.
+- **Create variations**: Build specialized versions of your template for different use cases (development vs. production).
+- **Monitor usage**: Track how your custom templates perform in production and optimize accordingly.
+
+For more advanced template management, see the [Template Management API documentation](/api-reference/templates/POST/templates) to programmatically create and update templates.