DeepFabric Cloud (dfcloud)

Run DeepFabric jobs on Google Cloud with Slack notifications.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      Google Cloud                            │
│                                                              │
│  ┌─────────────────┐         ┌─────────────────────────┐    │
│  │  Cloud Run      │  HTTP   │  Cloud Run Service      │    │
│  │  Job            │────────▶│  (Spin + mock data)     │    │
│  │  (deepfabric)   │         │  always-on              │    │
│  └────────┬────────┘         └─────────────────────────┘    │
│           │                                                  │
│           │ read config / write output                       │
│           ▼                                                  │
│  ┌─────────────────┐                                        │
│  │  Cloud Storage  │                                        │
│  │  (GCS bucket)   │                                        │
│  └─────────────────┘                                        │
└───────────┬─────────────────────────────────────────────────┘
            │
            │ on completion
            ▼
   ┌─────────────────┐
   │  Slack Webhook  │
   └─────────────────┘

Prerequisites

Google Cloud account with billing enabled
gcloud CLI installed and authenticated
Terraform >= 1.0
Docker for building container images
Python >= 3.10 for the CLI

Quick Start

1. Set up Slack Webhook

Go to api.slack.com/apps
Create a new app → "From scratch"
Add feature → "Incoming Webhooks" → Enable
Click "Add New Webhook to Workspace" → Select your channel
Copy the webhook URL

2. Configure Terraform

cd infra

# Copy and edit the example config
cp terraform.tfvars.example terraform.tfvars

# Edit terraform.tfvars with your values:
# - project_id
# - spin_image (we'll create this next)
# - deepfabric_image (we'll create this next)
# - slack_webhook_url

3. Build and Push Container Images

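First, create only the Artifact Registry repository and enable the required APIs, so that a registry exists to push images to. The cd commands in each step assume you start from the repository root.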

cd infra

# Initialize and apply just the registry first
terraform init
terraform apply -target=google_artifact_registry_repository.dfcloud -target=google_project_service.apis

Get the registry URL:

terraform output artifact_registry
# Output: us-central1-docker.pkg.dev/your-project/dfcloud

Build and push the DeepFabric job image:

cd ../deepfabric-job

# Configure Docker for Artifact Registry
gcloud auth configure-docker us-central1-docker.pkg.dev

# Build and push
REGISTRY=$(cd ../infra && terraform output -raw artifact_registry)
docker build -t "$REGISTRY/deepfabric-job:latest" .
docker push "$REGISTRY/deepfabric-job:latest"

Build and push your Spin service image, if you have one. Run this in the same shell so that $REGISTRY is still set:

# From your spin tools-sdk directory
docker build -t "$REGISTRY/spin:latest" .
docker push "$REGISTRY/spin:latest"

4. Deploy Infrastructure

cd infra

# Set spin_image and deepfabric_image in terraform.tfvars to the image paths pushed in step 3
# Then apply all resources
terraform apply

5. Install the CLI

cd cli
pip install -e .

# Initialize configuration
dfcloud config init
# Enter: project_id, region, bucket name

6. Load Mock Data into Spin (if needed)

If your Spin service needs mock data loaded:

# Get the Spin service URL
cd infra
terraform output spin_service_url

# Load mock data (adjust for your setup)
curl -X POST "$(terraform output -raw spin_service_url)/mock/load" \
  -H "Content-Type: application/json" \
  -d @path/to/mock-data.json

Usage

Submit a Job

# Submit a job from a config file
dfcloud submit spin-dataforseo-5x5.yaml --name seo-dataset-v1

# Submit and wait for completion
dfcloud submit spin-dataforseo.yaml --name my-job --wait

# With custom timeout (in seconds)
dfcloud submit config.yaml --timeout 7200
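
For scripted runs, the same command line can be assembled programmatically. The following is a minimal sketch that uses only the flags shown above and assumes dfcloud is on your PATH; the helper names build_submit_command and submit are hypothetical and not part of the CLI.

```python
import subprocess
from typing import List, Optional


def build_submit_command(
    config_path: str,
    name: Optional[str] = None,
    wait: bool = False,
    timeout_seconds: Optional[int] = None,
) -> List[str]:
    """Assemble the argv for `dfcloud submit` (hypothetical helper)."""
    cmd = ["dfcloud", "submit", config_path]
    if name is not None:
        cmd += ["--name", name]
    if wait:
        cmd.append("--wait")
    if timeout_seconds is not None:
        # --timeout takes a value in seconds.
        cmd += ["--timeout", str(timeout_seconds)]
    return cmd


def submit(config_path: str, **options) -> int:
    """Run the command and return its exit code (needs dfcloud installed)."""
    return subprocess.run(build_submit_command(config_path, **options)).returncode
```

Building the argument list separately from running it keeps the flag handling easy to check without touching Google Cloud.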

Check Job Status

# Check latest execution
dfcloud status

# Check specific execution
dfcloud status abc123-def456

View Logs

# View logs for latest execution
dfcloud logs

# View logs for specific execution
dfcloud logs abc123-def456

# Follow logs in real-time
dfcloud logs -f

List Executions

# List recent executions
dfcloud list

# Show more executions
dfcloud list --limit 20

Download Outputs

# List available outputs
dfcloud outputs

# Download outputs for a job
dfcloud download seo-dataset-v1

# Download to specific directory
dfcloud download seo-dataset-v1 --output ./my-outputs

Configuration

# Initialize config interactively
dfcloud config init

# Set individual values
dfcloud config set project_id my-project
dfcloud config set region us-central1
dfcloud config set bucket my-bucket-dfcloud

# List current config
dfcloud config list

# Get a specific value
dfcloud config get project_id

Configuration File Format

Your DeepFabric YAML configs work as-is. The job runner automatically:

Downloads the config from GCS
Updates spin_endpoint to point to the Cloud Run Spin service
Updates tools_endpoint if it references localhost
Runs deepfabric generate
Uploads outputs to GCS
Sends Slack notification

Example config structure:

topics:
  prompt: "Tasks for an SEO assistant..."
  mode: graph
  depth: 5
  degree: 5
  save_as: "topic-graph.jsonl"  # Will be uploaded to GCS
  llm:
    provider: "gemini"
    model: "gemini-2.5-flash"

generation:
  tools:
    spin_endpoint: "http://localhost:3000"  # Auto-updated to Cloud Run URL
    tools_endpoint: "http://localhost:3000/mock/list-tools"
  # ... rest of config

output:
  num_samples: 750
  save_as: "dataset.jsonl"  # Will be uploaded to GCS
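
The endpoint rewriting described above might look roughly like the following. This is a sketch rather than the job runner's actual code: the function names are hypothetical, the config is assumed to be already parsed into a dict, and keeping the original path of tools_endpoint is an assumption.

```python
from copy import deepcopy
from urllib.parse import urlsplit, urlunsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def _swap_origin(url: str, service_url: str) -> str:
    """Replace scheme and host of `url` with those of `service_url`, keeping the path."""
    old, new = urlsplit(url), urlsplit(service_url)
    return urlunsplit((new.scheme, new.netloc, old.path, old.query, old.fragment))


def rewrite_endpoints(config: dict, service_url: str) -> dict:
    """Return a copy of `config` with tool endpoints pointed at `service_url`."""
    updated = deepcopy(config)
    tools = updated.get("generation", {}).get("tools", {})
    if "spin_endpoint" in tools:
        # spin_endpoint is always replaced with the Cloud Run URL.
        tools["spin_endpoint"] = service_url
    tools_endpoint = tools.get("tools_endpoint")
    if tools_endpoint and urlsplit(tools_endpoint).hostname in LOCAL_HOSTS:
        # tools_endpoint is only rewritten when it references localhost.
        tools["tools_endpoint"] = _swap_origin(tools_endpoint, service_url)
    return updated
```

Because the function works on a copy, the config you keep locally can continue to point at localhost for local runs.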

Outputs

Job outputs are stored in GCS at:

gs://{bucket}/outputs/{job-name}/{timestamp}/
├── topic-graph.jsonl
└── dataset.jsonl
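
If you need to locate outputs from a script, the layout above can be reproduced as follows. This is a sketch with hypothetical helper names; the timestamp format is inferred from the example notification (20240110-143022), and the time zone the job runner uses is not specified here.

```python
from datetime import datetime


def output_prefix(bucket: str, job_name: str, when: datetime) -> str:
    """Build gs://{bucket}/outputs/{job-name}/{timestamp}/ for a given run time."""
    timestamp = when.strftime("%Y%m%d-%H%M%S")  # assumed format
    return f"gs://{bucket}/outputs/{job_name}/{timestamp}/"


def output_uri(bucket: str, job_name: str, when: datetime, filename: str) -> str:
    """Full URI of one output file, e.g. dataset.jsonl."""
    return output_prefix(bucket, job_name, when) + filename
```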

Slack Notifications

You'll receive notifications for:

Job completed: Shows duration and output file locations
Job failed: Shows error message and duration

Example notification:

✅ Job Completed: seo-dataset-v1

Status: Succeeded
Duration: 45.2 minutes

Outputs:
• gs://my-bucket/outputs/seo-dataset-v1/20240110-143022/topic-graph.jsonl
• gs://my-bucket/outputs/seo-dataset-v1/20240110-143022/dataset.jsonl
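
A message like the one above can be sent to a Slack incoming webhook as a JSON body with a text field. The sketch below shows one way to build it; the function names are hypothetical, and the wording of the failure message is an assumption, since only the success format is shown here.

```python
import json
import urllib.request
from typing import Iterable, Optional


def build_notification(
    job_name: str,
    succeeded: bool,
    duration_seconds: float,
    outputs: Iterable[str] = (),
    error: Optional[str] = None,
) -> dict:
    """Build a Slack incoming-webhook payload for a finished job."""
    icon = "✅" if succeeded else "❌"  # failure icon is an assumption
    headline = "Job Completed" if succeeded else "Job Failed"
    lines = [
        f"{icon} {headline}: {job_name}",
        "",
        f"Status: {'Succeeded' if succeeded else 'Failed'}",
        f"Duration: {duration_seconds / 60:.1f} minutes",
    ]
    outputs = list(outputs)
    if succeeded and outputs:
        lines += ["", "Outputs:"] + [f"• {uri}" for uri in outputs]
    if not succeeded and error:
        lines += ["", f"Error: {error}"]
    return {"text": "\n".join(lines)}


def send(webhook_url: str, payload: dict) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling send with the webhook URL from step 1 and a small test payload is also a quick way to confirm the webhook works before running a long job.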

Cost Optimization

Spin service: Set spin_min_instances = 0 for scale-to-zero (adds cold start latency)
Job resources: Adjust deepfabric_job_memory and deepfabric_job_cpu based on your workloads
Storage: Outputs older than 90 days are automatically moved to Nearline storage
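
For reference, the 90-day Nearline transition corresponds to a GCS lifecycle rule of the following shape. This sketch only prints the rule in the JSON format that Cloud Storage lifecycle configuration uses; in this project the rule is managed by Terraform, and whether it is scoped to a prefix is not stated here, so none is assumed.

```python
import json


def nearline_lifecycle(age_days: int = 90) -> dict:
    """Lifecycle config that moves objects older than `age_days` to Nearline."""
    return {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": age_days},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(nearline_lifecycle(), indent=2))
```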

Troubleshooting

Job fails immediately

Check that:

The Spin service is running: gcloud run services describe spin-service --region us-central1
Config file exists in GCS: gsutil ls gs://your-bucket/configs/
The job's service account has the permissions it needs: read and write access to the GCS bucket, and run.invoker on the Spin service

Can't connect to Spin

The DeepFabric job's service account needs the roles/run.invoker role on the Spin service. Verify that it appears in the service's IAM policy:

gcloud run services get-iam-policy spin-service --region us-central1
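
To check the binding in a script rather than by eye, you can add --format=json to the command above and inspect the result. The sketch below assumes the policy has been parsed into a dict; the function name and the service account email in the usage note are hypothetical.

```python
INVOKER_ROLE = "roles/run.invoker"


def has_invoker(policy: dict, member: str) -> bool:
    """Return True if `member` is bound to roles/run.invoker in an IAM policy."""
    for binding in policy.get("bindings", []):
        if binding.get("role") == INVOKER_ROLE and member in binding.get("members", []):
            return True
    return False
```

For example, has_invoker(json.loads(output), "serviceAccount:dfcloud-job@my-project.iam.gserviceaccount.com") returns False when the binding is missing, which would explain a 403 from the Spin service.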

Logs show authentication errors

Ensure your local gcloud is authenticated:

gcloud auth login
gcloud auth application-default login

Cleanup

To destroy all resources:

cd infra
terraform destroy

Note: This will delete the GCS bucket and all outputs. Back up any important data first.
