Get your coding agent to batch-process lots of files (documents, spreadsheets, images, scanned PDFs) through the Doubleword API, cheaply and asynchronously, with 50-85% savings compared to real-time LLM calls. You tell your agent which files you want processed and give it a prompt; it creates a batch, submits it, polls for completion, and saves the results back to disk.
I built this for Claude Code, and it works very well there with Claude as the driving model. I also tested it with GLM-4.7 running on Ollama as the driving model; it worked, but the overall experience was slower.
All you need to know to get started is in this README; SKILL.md and GUIDE.md are really for agents to read.
- Bulk summarization and analysis across many files in one go.
- Image captioning and OCR (including receipts/invoices) when using a vision-capable model.
- Structured extraction (for example “return JSON with these fields”).
- Asynchronous batching that is usually much cheaper than doing the same work via synchronous calls.
Typical “yes use this” jobs:
- “Summarize these 50 PDFs.”
- “Extract vendor, date, total, and line items from these 200 receipt images.”
- “Caption these product photos.”
- “Pull key metrics out of these CSV files.”
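For instance, the receipts job above might ask each file to come back as JSON along these lines (the field names follow the example; the exact schema is whatever you put in your prompt):

```json
{
  "vendor": "Acme Office Supplies",
  "date": "2024-03-14",
  "total": 43.40,
  "line_items": [
    {"description": "A4 paper, 5 reams", "quantity": 1, "amount": 24.90},
    {"description": "Stapler", "quantity": 2, "amount": 18.50}
  ]
}
```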
```bash
git clone https://github.com/NnamdiOdozi/dw_batch_skill
cd dw_batch_skill
```

You are copying the folder named `dw_batch` into the “skills” directory, at either project or user level.
Pick one of these patterns:
User level. This makes the skill available to all Claude Code sessions.

```bash
mkdir -p ~/.claude/skills
cp -R dw_batch ~/.claude/skills/
```

Project level. This makes the skill available only inside one repo/project.

```bash
mkdir -p .claude/skills
cp -r dw_batch .claude/skills/
```

If you are using a different harness (not Claude Code), put the folder wherever that harness expects “skills”. The important thing is that the folder contains SKILL.md plus the scripts.
- Create your local secrets file from the sample:
```bash
cp .env.dw.sample .env.dw
```

- Edit `.env.dw` and set:

```
DOUBLEWORD_AUTH_TOKEN=sk-your-doubleword-key
```

Get a Doubleword key from the Doubleword portal: https://www.doubleword.ai/
Security note: .env.dw should never be committed to Git. Keep it local.
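One simple way to guard against accidental commits, assuming your project is a Git repo and `.env.dw` is not already ignored:

```bash
# Ignore the local secrets file so it can't be committed by accident
echo ".env.dw" >> .gitignore
# Verify Git now ignores it (prints the matching rule and path if ignored)
git check-ignore -v .env.dw
```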
Check that you are happy with the defaults in the `config.toml` file, e.g. the LLM model, SLA, and polling interval.
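As a rough illustration of the kind of settings to review (the key names below are assumptions; check the shipped `config.toml` for the real ones):

```toml
# Illustrative sketch only: actual key names and values live in the shipped config.toml
model = "some-provider-model"     # which LLM the batch requests use
sla = "24h"                       # how long the provider has to complete the batch
polling_interval_seconds = 60     # how often the skill checks for results
```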
**dw_batch:** Async batch processing using Doubleword API. Process multiple PDFs/DOCXs cost-effectively (50-85% cheaper) for non-urgent summarization, analysis, OCR, and extraction tasks. **Suggest for any token-heavy non-urgent work (10K+ tokens).**
You just need to say "dw_batch this task", together with a prompt that includes how long the output should be (in words or tokens) and any other requirements; the agent will load the skill and take care of the rest, including setting config. Sometimes the agent will detect a suitable set of tasks for batching and prompt you about it.
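For example, a request to the agent might look like this (the wording is purely illustrative):

```
dw_batch this task: summarize every PDF in ./reports into roughly 300 words each,
plain English, one markdown file per input.
```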
Results will be written under:
- `dw_batch_output/` for final outputs
- `dw_batch_output/logs/` for batch request artifacts and batch IDs
The Doubleword API is OpenAI-compatible. If you want to point this skill at OpenAI (or another compatible gateway), change the base URL and model in the config file (typically `config.toml` or `dw_batch/config.toml`):

- Set `base_url` to your provider’s API base.
- Set `model` to a model name your provider supports.
- Put that provider’s key in `.env.dw` (same variable name; this skill just needs a token).
That is it. The request format remains the same.
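For example, pointing the skill at a hypothetical OpenAI-compatible gateway might look like this in `config.toml` (the URL and model name are placeholders, and the surrounding keys in your file may differ):

```toml
# Point the skill at a different OpenAI-compatible provider
base_url = "https://api.example-provider.com/v1"   # your provider's API base
model = "provider-model-name"                      # a model that provider supports
```

The key in `.env.dw` stays under the same variable name (`DOUBLEWORD_AUTH_TOKEN`); it just holds the other provider's token.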
There are quite a few built-in safety features, e.g.:

- Cost guardrails: the skill enforces input and output token thresholds (for example, the defaults are 250K input tokens or 100K output tokens) and will stop and warn before an expensive batch runs.
- Dry-run mode: lets the agent estimate scope and cost before submitting anything to the provider.
- Safe failure mode: if one file fails extraction (unsupported, too short, corrupted), batch creation logs it and continues with the remaining files rather than silently failing everything.
- Resumable polling: stopping polling (Ctrl+C) does not cancel the remote batch; you can restart polling later to fetch results, which avoids “panic re-submit” double-spend.
I only tried the skill with Claude Code. If you want to try it with other (or even multiple) agent harnesses, my advice would be to use symlinks so that they all point to the same skills folder.
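A sketch of that approach, assuming the cloned repo lives at `~/dw_batch_skill` and a second harness reads skills from `~/.someharness/skills` (both paths are placeholders):

```bash
# Keep one real copy in the cloned repo and point each harness at it
ln -s ~/dw_batch_skill/dw_batch ~/.claude/skills/dw_batch
ln -s ~/dw_batch_skill/dw_batch ~/.someharness/skills/dw_batch
```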
See also:
- `SKILL.md` for how the agent should invoke this skill.
- `GUIDE.md` for deeper usage patterns and troubleshooting.
- `examples.md` for ready-to-run prompts (receipts, multimodal, OCR).
Planned: implement a local queue and jobs manifest, plus a sleeper agent that wakes up every 30 minutes to work on the local queue.
This is provided as-is. If you hit issues, open a GitHub issue with:
- your command, your OS, and the error text
- a description of the file types you processed (no sensitive content)