Run DeepFabric jobs on Google Cloud with Slack notifications.
┌─────────────────────────────────────────────────────────────┐
│                        Google Cloud                         │
│                                                             │
│  ┌─────────────────┐         ┌─────────────────────────┐    │
│  │  Cloud Run      │  HTTP   │  Cloud Run Service      │    │
│  │  Job            │────────▶│  (Spin + mock data)     │    │
│  │  (deepfabric)   │         │  always-on              │    │
│  └────────┬────────┘         └─────────────────────────┘    │
│           │                                                 │
│           │ read config / write output                      │
│           ▼                                                 │
│  ┌─────────────────┐                                        │
│  │  Cloud Storage  │                                        │
│  │  (GCS bucket)   │                                        │
│  └─────────────────┘                                        │
└───────────┬─────────────────────────────────────────────────┘
            │
            │ on completion
            ▼
   ┌─────────────────┐
   │  Slack Webhook  │
   └─────────────────┘

- Google Cloud account with billing enabled
- gcloud CLI installed and authenticated
- Terraform >= 1.0
- Docker for building container images
- Python >= 3.10 for the CLI
- Go to api.slack.com/apps
- Create a new app → "From scratch"
- Add feature → "Incoming Webhooks" → Enable
- Click "Add New Webhook to Workspace" → Select your channel
- Copy the webhook URL
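
Before putting the URL into terraform.tfvars, it can help to confirm that the webhook accepts messages. The following is a minimal sketch using only the Python standard library. The helper names `build_webhook_request` and `send_test_message` are hypothetical and not part of dfcloud; the payload is the plain `{"text": ...}` JSON body that Slack incoming webhooks accept.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a Slack incoming-webhook payload."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url: str, text: str = "dfcloud webhook test") -> int:
    """Send the message and return the HTTP status code (200 on success)."""
    request = build_webhook_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_test_message` with your copied webhook URL should post the test text to the channel you selected.
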
cd infra
# Copy and edit the example config
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your values:
# - project_id
# - spin_image (we'll create this next)
# - deepfabric_image (we'll create this next)
# - slack_webhook_url

First, deploy infrastructure to create the Artifact Registry:
cd infra
# Initialize and apply just the registry first
terraform init
terraform apply -target=google_artifact_registry_repository.dfcloud -target=google_project_service.apis

Get the registry URL:
terraform output artifact_registry
# Output: us-central1-docker.pkg.dev/your-project/dfcloud

Build and push the DeepFabric job image:
cd ../deepfabric-job
# Configure Docker for Artifact Registry
gcloud auth configure-docker us-central1-docker.pkg.dev
# Build and push
REGISTRY=$(cd ../infra && terraform output -raw artifact_registry)
docker build -t $REGISTRY/deepfabric-job:latest .
docker push $REGISTRY/deepfabric-job:latest

Build and push your Spin service image (assuming you have it):
# From your spin tools-sdk directory
docker build -t $REGISTRY/spin:latest .
docker push $REGISTRY/spin:latest

cd infra
# Update terraform.tfvars with the image paths
# Then apply all resources
terraform apply

cd cli
pip install -e .
# Initialize configuration
dfcloud config init
# Enter: project_id, region, bucket name

If your Spin service needs mock data loaded:
# Get the Spin service URL
cd infra
terraform output spin_service_url
# Load mock data (adjust for your setup)
curl -X POST "$(terraform output -raw spin_service_url)/mock/load" \
  -H "Content-Type: application/json" \
  -d @path/to/mock-data.json

# Submit a job from a config file
dfcloud submit spin-dataforseo-5x5.yaml --name seo-dataset-v1
# Submit and wait for completion
dfcloud submit spin-dataforseo.yaml --name my-job --wait
# With custom timeout (in seconds)
dfcloud submit config.yaml --timeout 7200

# Check latest execution
dfcloud status
# Check specific execution
dfcloud status abc123-def456

# View logs for latest execution
dfcloud logs
# View logs for specific execution
dfcloud logs abc123-def456
# Follow logs in real-time
dfcloud logs -f

# List recent executions
dfcloud list
# Show more executions
dfcloud list --limit 20

# List available outputs
dfcloud outputs
# Download outputs for a job
dfcloud download seo-dataset-v1
# Download to specific directory
dfcloud download seo-dataset-v1 --output ./my-outputs

# Initialize config interactively
dfcloud config init
# Set individual values
dfcloud config set project_id my-project
dfcloud config set region us-central1
dfcloud config set bucket my-bucket-dfcloud
# List current config
dfcloud config list
# Get a specific value
dfcloud config get project_id

Your DeepFabric YAML configs work as-is. The job runner automatically:
- Downloads the config from GCS
- Updates spin_endpoint to point to the Cloud Run Spin service
- Updates tools_endpoint if it references localhost
- Runs deepfabric generate
- Uploads outputs to GCS
- Sends a Slack notification
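
The endpoint rewriting in the steps above can be sketched roughly as follows. This is an illustration, not the job runner's actual code: the function name `rewrite_endpoints` is hypothetical, and it assumes the endpoints live under `generation.tools` in the parsed config, as in the example config in this README.

```python
import copy
from urllib.parse import urlsplit, urlunsplit

# Hosts treated as "local" for the tools_endpoint check (an assumption).
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def rewrite_endpoints(config: dict, spin_url: str) -> dict:
    """Return a copy of config whose tool endpoints point at spin_url."""
    updated = copy.deepcopy(config)
    tools = updated.get("generation", {}).get("tools", {})

    # spin_endpoint is always replaced with the Cloud Run service URL.
    if "spin_endpoint" in tools:
        tools["spin_endpoint"] = spin_url

    # tools_endpoint is only rewritten when it targets a local host;
    # its path (e.g. /mock/list-tools) is preserved.
    endpoint = tools.get("tools_endpoint")
    if endpoint:
        parts = urlsplit(endpoint)
        if parts.hostname in LOCAL_HOSTS:
            target = urlsplit(spin_url)
            tools["tools_endpoint"] = urlunsplit(
                (target.scheme, target.netloc, parts.path, parts.query, parts.fragment)
            )
    return updated
```

Working on a deep copy keeps the original config untouched, so the version stored in GCS can still be compared with what the job actually ran.
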
Example config structure:
topics:
  prompt: "Tasks for an SEO assistant..."
  mode: graph
  depth: 5
  degree: 5
  save_as: "topic-graph.jsonl"  # Will be uploaded to GCS
  llm:
    provider: "gemini"
    model: "gemini-2.5-flash"

generation:
  tools:
    spin_endpoint: "http://localhost:3000"  # Auto-updated to Cloud Run URL
    tools_endpoint: "http://localhost:3000/mock/list-tools"
  # ... rest of config

output:
  num_samples: 750
  save_as: "dataset.jsonl"  # Will be uploaded to GCS

Job outputs are stored in GCS at:
gs://{bucket}/outputs/{job-name}/{timestamp}/
├── topic-graph.jsonl
└── dataset.jsonl

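
To locate a job's outputs programmatically, the prefix can be built from the layout above. This is a small sketch: the function name `output_prefix` is hypothetical, the timestamp format is inferred from the example paths in this README (such as 20240110-143022), and the timezone the job runner uses is not stated here, so UTC in the usage note is an assumption.

```python
from datetime import datetime


def output_prefix(bucket: str, job_name: str, started_at: datetime) -> str:
    """Build the gs:// prefix for a job run: outputs/{job-name}/{timestamp}/."""
    stamp = started_at.strftime("%Y%m%d-%H%M%S")
    return f"gs://{bucket}/outputs/{job_name}/{stamp}/"
```

For example, `output_prefix("my-bucket", "seo-dataset-v1", datetime(2024, 1, 10, 14, 30, 22))` yields the prefix under which topic-graph.jsonl and dataset.jsonl would sit.
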
You'll receive notifications for:
- Job completed: Shows duration and output file locations
- Job failed: Shows error message and duration
Example notification:
✅ Job Completed: seo-dataset-v1
Status: Succeeded
Duration: 45.2 minutes
Outputs:
• gs://my-bucket/outputs/seo-dataset-v1/20240110-143022/topic-graph.jsonl
• gs://my-bucket/outputs/seo-dataset-v1/20240110-143022/dataset.jsonl

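
A message of this shape could be assembled along the following lines. This is only a sketch of the text formatting: the function name `format_notification` is hypothetical, the "Job Failed" headline and "Error:" line are assumptions (only the success message is shown in this README), and the job runner's real formatting may differ.

```python
from typing import Optional, Sequence


def format_notification(
    job_name: str,
    succeeded: bool,
    duration_seconds: float,
    outputs: Sequence[str] = (),
    error: Optional[str] = None,
) -> str:
    """Format the notification text for a finished job."""
    headline = "Job Completed" if succeeded else "Job Failed"
    status = "Succeeded" if succeeded else "Failed"
    lines = [
        f"{headline}: {job_name}",
        f"Status: {status}",
        f"Duration: {duration_seconds / 60:.1f} minutes",
    ]
    if succeeded and outputs:
        # Completed jobs list where each output file landed in GCS.
        lines.append("Outputs:")
        lines.extend(f"• {uri}" for uri in outputs)
    if not succeeded and error:
        # Failed jobs carry the error message instead.
        lines.append(f"Error: {error}")
    return "\n".join(lines)
```

The returned string can be sent as the `text` field of a Slack incoming-webhook payload.
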
- Spin service: Set spin_min_instances = 0 for scale-to-zero (adds cold start latency)
- Job resources: Adjust deepfabric_job_memory and deepfabric_job_cpu based on your workloads
- Storage: Outputs older than 90 days are automatically moved to Nearline storage
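
The storage rule is set up by the Terraform in infra, which is not reproduced here. As a rough illustration of what such a rule amounts to, the snippet below prints an equivalent Cloud Storage lifecycle configuration as JSON. The function name `nearline_lifecycle` is hypothetical, and whether the real rule is scoped to a prefix such as outputs/ is not stated in this README.

```python
import json


def nearline_lifecycle(age_days: int = 90) -> dict:
    """Lifecycle config moving objects older than age_days to Nearline."""
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": age_days},
            }
        ]
    }


print(json.dumps(nearline_lifecycle(), indent=2))
```

If you download outputs often, keep in mind that Nearline has retrieval costs, so a longer age threshold may suit frequently accessed datasets better.
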
Check that:
- The Spin service is running:
  gcloud run services describe spin-service --region us-central1
- Config file exists in GCS:
  gsutil ls gs://your-bucket/configs/
- Service account has correct permissions
The DeepFabric job needs the run.invoker role on the Spin service. Verify:
gcloud run services get-iam-policy spin-service --region us-central1

Ensure your local gcloud is authenticated:
gcloud auth login
gcloud auth application-default login

To destroy all resources:
cd infra
terraform destroy

Note: This will delete the GCS bucket and all outputs. Back up any important data first.