This repository contains 12 agent skills for the Together AI platform. Each skill is a self-contained directory following the Agent Skills specification.
- **together-chat-completions**: Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-turn conversations, tool and function calling, structured JSON outputs, and reasoning models. Reach for it whenever the user wants to build or debug text generation on Together AI, unless they specifically need batch jobs, embeddings, fine-tuning, dedicated endpoints, dedicated containers, or GPU clusters.
- **together-images**: Text-to-image generation and image editing via Together AI, including FLUX and Kontext models, LoRA-based styling, reference-image guidance, and local image downloads. Reach for it whenever the user wants to generate or edit images on Together AI rather than create videos or build text-only chat applications.
- **together-video**: Text-to-video and image-to-video generation via Together AI, including keyframe control, model and dimension selection, asynchronous job polling, and video downloads. Reach for it whenever the user wants motion generation on Together AI rather than still-image generation or text-only inference.
- **together-audio**: Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transcription, translation, diarization, timestamps, and live STT. Reach for it whenever the user needs audio in or audio out on Together AI rather than chat generation, image or video creation, or model training.
- **together-embeddings**: Dense vector embeddings, semantic search, RAG pipelines, and reranking via Together AI. Generate embeddings with open-source models and rerank results behind dedicated endpoints. Reach for it whenever the user needs vector representations or retrieval quality improvements rather than direct text generation.
- **together-fine-tuning**: LoRA, full fine-tuning, DPO preference tuning, VLM training, function-calling tuning, reasoning tuning, and BYOM uploads on Together AI. Reach for it whenever the user wants to adapt a model on custom data rather than only run inference, evaluate outputs, or host an existing model.
- **together-batch-inference**: High-volume, asynchronous offline inference at up to 50% lower cost via Together AI's Batch API. Prepare JSONL inputs, upload files, create jobs, poll status, and download outputs. Reach for it whenever the user needs non-interactive bulk inference rather than real-time chat or evaluation jobs.
- **together-evaluations**: LLM-as-a-judge evaluation framework on Together AI. Classify, score, and compare model outputs, select judge models, use external-provider judges or targets, poll results, and download reports. Reach for it whenever the user wants to benchmark outputs, grade responses, compare A/B variants, or operationalize automated evaluations.
- **together-sandboxes**: Remote Python execution in managed sandboxes on Together AI with stateful sessions, file uploads, data analysis, chart generation, and notebook-like runs via the Sandboxes API. Reach for it whenever the user wants managed remote Python execution instead of local execution, raw clusters, or full model hosting.
- **together-dedicated-endpoints**: Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits. Deploy fine-tuned or uploaded models, size hardware, and manage endpoint lifecycle. Reach for it whenever the user needs predictable always-on hosting rather than serverless inference, custom containers, or raw clusters.
- **together-dedicated-containers**: Custom Dockerized inference workers on Together AI's managed GPU infrastructure. Build with Sprocket SDK, configure with Jig CLI, submit async queue jobs, and poll results. Reach for it whenever the user needs container-level control rather than a standard model endpoint or raw cluster.
- **together-gpu-clusters**: On-demand and reserved GPU clusters (H100, H200, B200) on Together AI with Kubernetes or Slurm orchestration, shared storage, credential management, and cluster scaling for ML and HPC jobs. Reach for it when the user needs multi-node compute or infrastructure control rather than a managed model endpoint.

The repository is laid out as follows:

```text
togetherai-skills/
├── AGENTS.md                 # This file — agent instructions
├── README.md                 # Human-facing docs
├── LICENSE                   # MIT
├── quality/
│   └── trigger-evals/        # Skill trigger eval sets
├── scripts/                  # Repo tooling and generators
└── skills/
    └── together-<product>/   # One directory per skill
        ├── SKILL.md          # Required — frontmatter + instructions
        ├── agents/
        │   └── openai.yaml   # Optional — UI metadata for OpenAI/Codex surfaces
        ├── references/       # Optional — detailed reference docs
        │   ├── models.md
        │   ├── api-reference.md
        │   └── ...
        └── scripts/          # Optional — runnable Python examples
            └── <workflow>.py
```
Every skill must have a `SKILL.md` with YAML frontmatter and a Markdown body:

```yaml
---
name: together-<product>
description: "One-line description, no angle brackets, max 1024 chars"
---
```

Required frontmatter fields: `name`, `description`.
Optional frontmatter fields: `license`, `allowed-tools`, `metadata`, `compatibility`.
Rules:

- `name` must be kebab-case, max 64 characters
- `description` must NOT contain angle brackets (`<` or `>`)
- Body should stay lean; target under 500 lines and move deep detail into `references/`
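The `name` and `description` rules can be checked mechanically. Here is a minimal stdlib sketch; the kebab-case regex is an assumption, and the repo's own `scripts/quick_validate.py` remains authoritative:

```python
import re

# Assumed kebab-case pattern: lowercase alphanumeric groups joined by hyphens
KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def frontmatter_ok(name: str, description: str) -> bool:
    """Check the SKILL.md frontmatter rules described above."""
    name_ok = bool(KEBAB_CASE.match(name)) and len(name) <= 64
    desc_ok = "<" not in description and ">" not in description and len(description) <= 1024
    return name_ok and desc_ok


print(frontmatter_ok("together-images", "Text-to-image generation"))  # True
print(frontmatter_ok("Together_Images", "uses <angle> brackets"))     # False
```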
Every skill in this repo includes `agents/openai.yaml` with:

- `display_name`
- `short_description`
- `default_prompt`

The default prompt must explicitly mention the skill as `$skill-name`.
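A hypothetical `agents/openai.yaml` might look like the following sketch (the field values are illustrative, not taken from the repo):

```yaml
display_name: Together Images
short_description: Generate and edit images with FLUX models on Together AI.
default_prompt: Use the $together-images skill to generate a sample image.
```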
Markdown files in `references/` are loaded on demand when the agent needs deeper detail. Use these for model lists, full API specs, CLI command references, and data format documentation.
For reference files over ~100 lines, include a short `## Contents` section near the top so agents can route quickly.
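For example, a long reference file might open with a routing section like this sketch (the file and section names are hypothetical):

```markdown
# Image Models Reference

## Contents
- [Text-to-image models](#text-to-image-models)
- [Image-editing models](#image-editing-models)
- [LoRA and reference-image options](#lora-and-reference-image-options)
```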
Python files in `scripts/` are runnable examples demonstrating complete workflows. All scripts in this repo use the Together Python v2 SDK (`together>=2.0.0`).

- Target Python 3.10+
- Use the `together` v2 SDK with keyword-only arguments
- Every script must have a module docstring with: description, usage command, and requirements
- Include an `if __name__ == "__main__":` block with working examples
- Use type hints (`list[str]`, `str | None`)
- Initialize the client at module level: `client = Together()`
- Assume `TOGETHER_API_KEY` is set as an environment variable
- Prefer reusable CLIs over hard-coded one-off demos for multi-step or billable workflows
- No third-party dependencies beyond `together` unless absolutely necessary (note it in the docstring if so)
These are the correct v2 SDK method names. Do NOT use v1 patterns:

| Operation | v2 (correct) | v1 (wrong) |
|---|---|---|
| Create batch | `client.batches.create()` | `client.create_batch()` |
| Get batch | `client.batches.retrieve()` | `client.get_batch()` |
| Get endpoint | `client.endpoints.retrieve()` | `client.endpoints.get()` |
| Run code | `client.code_interpreter.execute()` | `client.code_interpreter.run()` |
| File content | `client.files.content()` | `client.files.retrieve_content()` |
| Evaluations | `client.evals.create()` | `client.evaluation.create()` |
| Batch input | `input_file_id=` | `file_id=` |
| Audio files | `with open(path, "rb") as f:` then pass `f` | pass file path string |
| Autoscaling | `autoscaling={"min_replicas": N, "max_replicas": M}` | `min_replicas=N, max_replicas=M` |
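To make the autoscaling row concrete: v2 passes one `autoscaling` dict rather than separate replica kwargs. A hedged sketch (the full endpoint-creation signature includes other fields not shown here):

```python
def autoscaling_config(min_replicas: int, max_replicas: int) -> dict[str, int]:
    """Build the single autoscaling dict the v2 SDK expects."""
    return {"min_replicas": min_replicas, "max_replicas": max_replicas}


# v2 (correct):  client.endpoints.create(..., autoscaling=autoscaling_config(1, 3))
# v1 (wrong):    client.endpoints.create(..., min_replicas=1, max_replicas=3)
print(autoscaling_config(1, 3))  # {'min_replicas': 1, 'max_replicas': 3}
```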
- Frontmatter descriptions should route by user intent, not read like marketing copy
- `SKILL.md` should tell the agent when to open a specific reference or run a specific script
- Avoid generic folder links such as "See [scripts/](scripts/)"; link to the exact script
- Keep overlapping skills explicit about hand-off boundaries
- Maintain trigger eval sets in `quality/trigger-evals/`
- Use ATX headings (`##`, not underlines)
- Code blocks must specify a language (`python`, `bash`, `json`)
- Use tables for parameter lists and model comparisons
- Keep lines under 120 characters where practical
- No emojis in SKILL.md files
Before committing changes, validate each modified skill:

```bash
python scripts/quick_validate.py skills/together-<skill>
```

The validator checks:

- YAML frontmatter exists and parses correctly
- `name` is present, kebab-case, max 64 chars
- `description` is present, no angle brackets, max 1024 chars
- No disallowed frontmatter keys
- Referenced files in `references/` and `scripts/` exist
And `python scripts/quality_check.py` warns on:

- oversized `SKILL.md` files
- long references without a TOC
- missing `agents/openai.yaml`
- generic `scripts/` links
- unsafe tempfile usage in Python scripts
- missing trigger eval sets
To add a new skill:

- Create `skills/together-<product>/SKILL.md` with frontmatter and body
- Add `references/` files for detailed specs (model tables, API params)
- Add `scripts/` with runnable Python v2 SDK examples if the skill involves multi-step workflows
- Create `agents/openai.yaml` with `display_name`, `short_description`, and `default_prompt`
- Validate with `python scripts/quick_validate.py skills/together-<product>`
- Run `./scripts/publish.sh` to regenerate AGENTS.md and README.md
- Update `.claude-plugin/marketplace.json` with the new skill entry
When editing an existing skill:

- Read the full SKILL.md before making changes
- Keep inline examples minimal — move detailed content to `references/`
- If updating SDK code, ensure it follows v2 patterns (see table above)
- If a model is deprecated, remove it from the model tables in `references/`
- Test any script changes by reviewing the code (scripts require a Together API key to actually run)
Model tables live in `references/models.md` (or similar) within each skill. Update the table rows. Do not change the table structure unless adding a new column that all rows need.
To add a new script to an existing skill:

- Create `skills/together-<skill>/scripts/<descriptive_name>.py`
- Follow the script conventions above (docstring, `__main__`, type hints)
- Add a reference line to the `## Resources` section of the skill's SKILL.md:
  `- **Runnable script**: See [scripts/<name>.py](scripts/<name>.py) — short description (v2 SDK)`
If a Together API changes, update in this order:

1. The `SKILL.md` inline examples
2. The `references/` docs
3. The `scripts/` files
4. This AGENTS.md, if the v2 SDK patterns table needs updating
Do NOT:

- Add `README.md`, `CHANGELOG.md`, or `INSTALLATION_GUIDE.md` inside individual skill directories — the Agent Skills spec forbids extraneous docs within skills
- Use angle brackets in any `description` frontmatter field
- Use v1 SDK method names in any code
- Add dependencies beyond `together` to scripts without noting it in the docstring
- Create empty `references/` or `scripts/` directories — only include them if they contain files