Commit fc31401

hwchase17, nhuang-lc, and lnhsingh authored
deepagents storage (#1150)
Co-authored-by: Nick Huang
Co-authored-by: Lauren Hirata Singh
1 parent e96586a commit fc31401

3 files changed

+471
-0
lines changed

src/docs.json

Lines changed: 2 additions & 0 deletions
@@ -268,12 +268,14 @@
268268
"group": "Get started",
269269
"pages": [
270270
"oss/python/deepagents/quickstart",
271+
"oss/python/deepagents/harness",
271272
"oss/python/deepagents/customization"
272273
]
273274
},
274275
{
275276
"group": "Core capabilities",
276277
"pages": [
278+
"oss/python/deepagents/backends",
277279
"oss/python/deepagents/subagents",
278280
"oss/python/deepagents/human-in-the-loop",
279281
"oss/python/deepagents/long-term-memory"

src/oss/deepagents/backends.mdx

Lines changed: 298 additions & 0 deletions
@@ -0,0 +1,298 @@
1+
---
2+
title: Backends
3+
description: Choose and configure filesystem backends for deep agents. You can specify routes to different backends, implement virtual filesystems, and enforce policies.
4+
---
5+
6+
Deep agents expose a filesystem surface to the agent via tools like `ls`, `read_file`, `write_file`, `edit_file`, `glob`, and `grep`. These tools operate through a pluggable backend.
7+
8+
This page explains how to [choose a backend](#specify-a-backend), [route different paths to different backends](#route-to-different-backends), [implement your own virtual filesystem](#use-a-virtual-filesystem) (e.g., S3 or Postgres), [add policy hooks](#add-policy-hooks), and [comply with the backend protocol](#protocol-reference).
9+
10+
## Quickstart
11+
12+
Here are a few pre-built filesystem backends that you can quickly use with your deep agent:
13+
14+
| Built-in backend | Description |
|---|---|
| [Default](#statebackend-ephemeral) | `agent = create_deep_agent()` <br></br> Ephemeral, stored in state. The default filesystem backend keeps files in `langgraph` state. Note that this filesystem only persists _for a single thread_. |
| [Local filesystem persistence](#filesystembackend-local-disk) | `agent = create_deep_agent(backend=FilesystemBackend(root_dir="/Users/nh/Desktop/"))` <br></br>This gives the deep agent access to your local machine's filesystem. You can specify the root directory that the agent has access to. Note that any provided `root_dir` must be an absolute path. |
| [Durable store (LangGraph store)](#storebackend-langgraph-store) | `agent = create_deep_agent(backend=lambda rt: StoreBackend(rt))` <br></br>This gives the agent access to long-term storage that is _persisted across threads_. This is useful for storing longer-term memories or instructions that apply to the agent over multiple executions. |
| [Composite](#compositebackend-router) | Ephemeral by default, `/memories/` persisted. The Composite backend is the most flexible option: you can point different routes in the filesystem at different backends. See [Route to different backends](#route-to-different-backends) for a ready-to-paste example. |
20+
21+
22+
## Built-in backends
23+
24+
### StateBackend (ephemeral)
25+
26+
:::python
27+
```python
from deepagents import create_deep_agent
from deepagents.backends import StateBackend

# By default, create_deep_agent uses a StateBackend
agent = create_deep_agent()

# Under the hood, this is equivalent to
agent = create_deep_agent(
    backend=(lambda rt: StateBackend(rt))  # Tools access state through runtime.state
)
```
38+
:::
39+
40+
**How it works:**
41+
- Stores files in LangGraph agent state for the current thread.
42+
- Persists across multiple agent turns on the same thread via checkpoints.
43+
44+
**Best for:**
45+
- A scratch pad for the agent to write intermediate results.
46+
- Automatic eviction of large tool outputs, which the agent can then read back in piece by piece.
47+
48+
### FilesystemBackend (local disk)
49+
50+
```python
from deepagents import create_deep_agent
from deepagents.backends import FilesystemBackend

agent = create_deep_agent(
    backend=FilesystemBackend(root_dir="/Users/nh/Desktop/")
)
```
57+
58+
**How it works:**
59+
- Reads/writes real files under a configurable `root_dir`.
60+
- Note: `root_dir` must be an absolute path.
61+
- You can optionally set `virtual_mode=True` to sandbox and normalize paths under `root_dir`.
62+
- Uses secure path resolution, prevents unsafe symlink traversal when possible, and can use ripgrep for fast `grep`.
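
The sandboxing idea behind `virtual_mode` can be illustrated with the standard library alone. The sketch below is not the `FilesystemBackend` implementation; `resolve_under_root` is a hypothetical helper that shows one common way to keep an agent-facing path inside a root directory.

```python
from pathlib import Path


def resolve_under_root(root_dir: str, virtual_path: str) -> Path:
    """Map an agent-facing path such as '/notes/a.txt' to a real path under root_dir.

    Hypothetical helper for illustration; not part of deepagents.
    """
    root = Path(root_dir).resolve()
    # Treat the agent path as relative to root, even though it starts with "/".
    candidate = (root / virtual_path.lstrip("/")).resolve()
    # After ".." segments and symlinks are resolved, the result must still sit inside root.
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"Path escapes root_dir: {virtual_path}")
    return candidate
```

A path like `/../etc/passwd` resolves outside the root and is rejected, while `/notes/a.txt` maps to `<root_dir>/notes/a.txt`.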
63+
64+
**Best for:**
65+
- Local projects on your machine
66+
- CI sandboxes
67+
- Mounted persistent volumes
68+
69+
### StoreBackend (LangGraph Store)
70+
71+
```python
from deepagents import create_deep_agent
from deepagents.backends import StoreBackend

agent = create_deep_agent(
    backend=(lambda rt: StoreBackend(rt))  # Tools access the store through runtime.store
)
```
78+
79+
**How it works:**
80+
- Stores files in a LangGraph `BaseStore` provided by the runtime, enabling cross‑thread durable storage.
81+
82+
**Best for:**
83+
- When you already run with a configured LangGraph store (for example, Redis, Postgres, or cloud implementations behind `BaseStore`).
84+
- When you're deploying your agent through LangSmith Deployments (a store is automatically provisioned for your agent).
85+
86+
87+
### CompositeBackend (router)
88+
89+
:::python
90+
```python
from deepagents import create_deep_agent
from deepagents.backends import StateBackend, StoreBackend
from deepagents.backends.composite import CompositeBackend

# CustomBackend is a placeholder for your own BackendProtocol implementation
composite_backend = lambda rt: CompositeBackend(
    default=StateBackend(rt),
    routes={
        "/memories/": StoreBackend(rt),
        "/docs/": CustomBackend(),
    },
)

agent = create_deep_agent(backend=composite_backend)
```
105+
:::
106+
107+
**How it works:**
108+
- Routes file operations to different backends based on path prefix.
109+
- Preserves the original path prefixes in listings and search results.
110+
111+
**Best for:**
112+
- When you want to give your agent both ephemeral and cross-thread storage, a CompositeBackend lets you provide both a StateBackend and a StoreBackend.
113+
- When you have multiple sources of information that you want to provide to your agent as part of a single filesystem.
114+
- For example, you have long-term memories stored under `/memories/` in one store and a custom backend that serves documentation at `/docs/`.
115+
116+
## Specify a backend
117+
118+
- Pass a backend to `create_deep_agent(backend=...)`. The filesystem middleware uses it for all tooling.
119+
- You can pass either:
120+
- An instance implementing `BackendProtocol` (for example, `FilesystemBackend(root_dir=".")`), or
121+
- A factory `BackendFactory = Callable[[ToolRuntime], BackendProtocol]` (for backends that need the runtime, such as `StateBackend` or `StoreBackend`).
122+
- If omitted, the default is `lambda rt: StateBackend(rt)`.
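
The difference between passing an instance and passing a factory can be sketched without the library. Everything here is a stand-in: `DictBackend`, `resolve_backend`, and the `SimpleNamespace` runtime are hypothetical names, and the real middleware may resolve backends differently.

```python
from types import SimpleNamespace
from typing import Any, Callable, Union


class DictBackend:
    """Hypothetical stand-in for a backend; it just holds the storage it was given."""

    def __init__(self, storage: dict):
        self.storage = storage


def resolve_backend(
    backend: Union[DictBackend, Callable[[Any], DictBackend]], runtime: Any
) -> DictBackend:
    # A factory is called once the runtime is known; an instance is used as-is.
    if callable(backend):
        return backend(runtime)
    return backend


runtime = SimpleNamespace(state={"files": {}})  # stand-in for the tool runtime

shared = DictBackend({})  # instance: built up front, needs nothing from the runtime
per_run = resolve_backend(lambda rt: DictBackend(rt.state), runtime)  # factory: built per runtime
```

A factory is the natural fit when the backend's storage lives on the runtime itself, which is why the state and store backends above are wrapped in `lambda rt: ...`.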
123+
124+
125+
## Route to different backends
126+
127+
Route parts of the namespace to different backends. Commonly used to persist `/memories/*` and keep everything else ephemeral.
128+
129+
:::python
130+
```python
from deepagents import create_deep_agent
from deepagents.backends import FilesystemBackend, StateBackend
from deepagents.backends.composite import CompositeBackend

composite_backend = lambda rt: CompositeBackend(
    default=StateBackend(rt),  # Everything outside the routes stays ephemeral
    routes={
        "/memories/": FilesystemBackend(root_dir="/deepagents/myagent"),
    },
)

agent = create_deep_agent(backend=composite_backend)
```
143+
:::
144+
145+
Behavior:
146+
- `/workspace/plan.md` → StateBackend (ephemeral)
147+
- `/memories/agent.md` → FilesystemBackend under `/deepagents/myagent`
148+
- `ls`, `glob`, `grep` aggregate results and show original path prefixes.
149+
150+
Notes:
151+
- Longer prefixes win (for example, route `"/memories/projects/"` can override `"/memories/"`).
152+
- For StoreBackend routing, ensure the agent runtime provides a store (`runtime.store`).
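
The "longer prefixes win" rule and the prefix-preserving listings can be sketched in a few lines. This is an illustration rather than the `CompositeBackend` source: `pick_route` and `restore_prefix` are hypothetical helpers, and the assumption that the routed backend receives the path with the prefix stripped is ours.

```python
from __future__ import annotations


def pick_route(path: str, routes: dict[str, str]) -> tuple[str | None, str]:
    """Return (matched_prefix, path_for_backend); the prefix is None for the default backend."""
    # Try longer prefixes first so "/memories/projects/" beats "/memories/".
    for prefix in sorted(routes, key=len, reverse=True):
        if path.startswith(prefix):
            # Assumption: the routed backend sees the path without the route prefix.
            return prefix, "/" + path[len(prefix):]
    return None, path


def restore_prefix(prefix: str, backend_path: str) -> str:
    """Re-attach the route prefix so listings show the original path."""
    return prefix + backend_path.lstrip("/")


routes = {"/memories/": "store", "/memories/projects/": "projects"}
prefix, inner = pick_route("/memories/projects/a.md", routes)
```

With these routes, `/memories/projects/a.md` goes to the more specific route, `/memories/agent.md` goes to the general one, and `/workspace/plan.md` falls through to the default.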
153+
154+
## Use a virtual filesystem
155+
156+
Build a custom backend to project a remote or database filesystem (e.g., S3 or Postgres) into the tools namespace.
157+
158+
Design guidelines:
159+
160+
- Paths are absolute (`/x/y.txt`). Decide how to map them to your storage keys/rows.
161+
- Implement `ls_info` and `glob_info` efficiently (server-side listing where available, otherwise local filter).
162+
- Return user-readable error strings for missing files or invalid regex patterns.
163+
- For external persistence, set `files_update=None` in results; only in-state backends should return a `files_update` dict.
164+
165+
S3-style outline:
166+
167+
:::python
168+
```python
from deepagents.backends.protocol import BackendProtocol, WriteResult, EditResult
from deepagents.backends.utils import FileInfo, GrepMatch

class S3Backend(BackendProtocol):
    def __init__(self, bucket: str, prefix: str = ""):
        self.bucket = bucket
        self.prefix = prefix.rstrip("/")

    def _key(self, path: str) -> str:
        # Paths are absolute ("/x/y.txt"), so the key is prefix + path
        return f"{self.prefix}{path}"

    def ls_info(self, path: str) -> list[FileInfo]:
        # List objects under _key(path); build FileInfo entries (path, size, modified_at)
        ...

    def read(self, file_path: str, offset: int = 0, limit: int = 2000) -> str:
        # Fetch object; return numbered content or an error string
        ...

    def grep_raw(self, pattern: str, path: str | None = None, glob: str | None = None) -> list[GrepMatch] | str:
        # Optionally filter server-side; else list and scan content
        ...

    def glob_info(self, pattern: str, path: str = "/") -> list[FileInfo]:
        # Apply glob relative to path across keys
        ...

    def write(self, file_path: str, content: str) -> WriteResult:
        # Enforce create-only semantics; return WriteResult(path=file_path, files_update=None)
        ...

    def edit(self, file_path: str, old_string: str, new_string: str, replace_all: bool = False) -> EditResult:
        # Read → replace (respect uniqueness vs replace_all) → write → return occurrences
        ...
```
204+
:::
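
To try the key mapping and the create-only `write` convention without an S3 client, a dict can stand in for the bucket. The sketch below is self-contained: `FakeBucket` and `WriteOutcome` are hypothetical stand-ins that mirror the `WriteResult` fields named on this page, not the library's classes, and the error wording is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class WriteOutcome:
    """Hypothetical stand-in mirroring the WriteResult fields named on this page."""

    error: str | None = None
    path: str | None = None
    files_update: dict | None = None


class FakeBucket:
    def __init__(self, prefix: str = ""):
        self.prefix = prefix.rstrip("/")
        self.objects: dict[str, str] = {}  # key -> content; stands in for the remote store

    def _key(self, path: str) -> str:
        return f"{self.prefix}{path}"

    def write(self, file_path: str, content: str) -> WriteOutcome:
        key = self._key(file_path)
        # Create-only: refuse to overwrite an existing object.
        if key in self.objects:
            return WriteOutcome(error=f"Error: File '{file_path}' already exists")
        self.objects[key] = content
        # External persistence: nothing to merge into agent state.
        return WriteOutcome(path=file_path, files_update=None)

    def ls(self, path: str) -> list[str]:
        base = self._key(path)
        # Strip the storage prefix so callers see agent-facing paths, sorted for determinism.
        return sorted(k[len(self.prefix):] for k in self.objects if k.startswith(base))


bucket = FakeBucket(prefix="agents/a1/")
first = bucket.write("/notes/todo.md", "buy milk")
second = bucket.write("/notes/todo.md", "overwrite attempt")
```

The second write returns an error instead of raising, which matches the guideline of returning user-readable error strings to the agent.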
205+
206+
Postgres-style outline:
207+
208+
- Table `files(path text primary key, content text, created_at timestamptz, modified_at timestamptz)`
209+
- Map tool operations onto SQL:
210+
- `ls_info` uses `WHERE path LIKE $1 || '%'`
211+
- `glob_info` filter in SQL or fetch then apply glob in Python
212+
- `grep_raw` can fetch candidate rows by extension or last modified time, then scan lines
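
The SQL mapping can be tried locally with `sqlite3` swapped in for Postgres (an assumption made so the example runs without a server; the placeholder syntax is `?` instead of `$1`). `ls_paths` and `glob_paths` are hypothetical helpers that return bare paths rather than `FileInfo` entries, and the glob here matches against the full path.

```python
import fnmatch
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("/docs/a.md", "alpha"), ("/docs/sub/b.txt", "beta"), ("/tmp/c.md", "gamma")],
)


def ls_paths(prefix: str) -> list[str]:
    # Prefix listing via LIKE. Note that "%" and "_" inside the prefix act as wildcards,
    # and SQLite's LIKE is case-insensitive for ASCII, unlike Postgres.
    rows = conn.execute(
        "SELECT path FROM files WHERE path LIKE ? || '%' ORDER BY path", (prefix,)
    )
    return [row[0] for row in rows]


def glob_paths(pattern: str, path: str = "/") -> list[str]:
    # Fetch candidates by prefix in SQL, then apply the glob in Python.
    return [p for p in ls_paths(path) if fnmatch.fnmatchcase(p, pattern)]
```

Fetching by prefix first keeps the Python-side glob cheap, since only rows under `path` are scanned.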
213+
214+
## Add policy hooks
215+
216+
Enforce enterprise rules by subclassing or wrapping a backend.
217+
218+
Block writes/edits under selected prefixes (subclass):
219+
220+
:::python
221+
```python
from deepagents.backends.filesystem import FilesystemBackend
from deepagents.backends.protocol import WriteResult, EditResult

class GuardedBackend(FilesystemBackend):
    def __init__(self, *, deny_prefixes: list[str], **kwargs):
        super().__init__(**kwargs)
        # Normalize so every prefix ends with "/"
        self.deny_prefixes = [p if p.endswith("/") else p + "/" for p in deny_prefixes]

    def write(self, file_path: str, content: str) -> WriteResult:
        if any(file_path.startswith(p) for p in self.deny_prefixes):
            return WriteResult(error=f"Writes are not allowed under {file_path}")
        return super().write(file_path, content)

    def edit(self, file_path: str, old_string: str, new_string: str, replace_all: bool = False) -> EditResult:
        if any(file_path.startswith(p) for p in self.deny_prefixes):
            return EditResult(error=f"Edits are not allowed under {file_path}")
        return super().edit(file_path, old_string, new_string, replace_all)
```
240+
:::
241+
242+
Generic wrapper (works with any backend):
243+
244+
:::python
245+
```python
from deepagents.backends.protocol import BackendProtocol, WriteResult, EditResult
from deepagents.backends.utils import FileInfo, GrepMatch

class PolicyWrapper(BackendProtocol):
    def __init__(self, inner: BackendProtocol, deny_prefixes: list[str] | None = None):
        self.inner = inner
        self.deny_prefixes = [p if p.endswith("/") else p + "/" for p in (deny_prefixes or [])]

    def _deny(self, path: str) -> bool:
        return any(path.startswith(p) for p in self.deny_prefixes)

    # Read-only operations pass straight through to the wrapped backend
    def ls_info(self, path: str) -> list[FileInfo]:
        return self.inner.ls_info(path)

    def read(self, file_path: str, offset: int = 0, limit: int = 2000) -> str:
        return self.inner.read(file_path, offset=offset, limit=limit)

    def grep_raw(self, pattern: str, path: str | None = None, glob: str | None = None) -> list[GrepMatch] | str:
        return self.inner.grep_raw(pattern, path, glob)

    def glob_info(self, pattern: str, path: str = "/") -> list[FileInfo]:
        return self.inner.glob_info(pattern, path)

    # Mutating operations are checked against the deny list first
    def write(self, file_path: str, content: str) -> WriteResult:
        if self._deny(file_path):
            return WriteResult(error=f"Writes are not allowed under {file_path}")
        return self.inner.write(file_path, content)

    def edit(self, file_path: str, old_string: str, new_string: str, replace_all: bool = False) -> EditResult:
        if self._deny(file_path):
            return EditResult(error=f"Edits are not allowed under {file_path}")
        return self.inner.edit(file_path, old_string, new_string, replace_all)
```
274+
:::
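
The trailing-slash normalization used by both policy examples is worth checking in isolation, because it decides which paths are blocked. The helpers below repeat that logic as free functions (`normalize_prefixes` and `is_denied` are names chosen for this sketch).

```python
def normalize_prefixes(prefixes: list[str]) -> list[str]:
    # Ensure every prefix ends with "/" so "/secrets" cannot match "/secrets-public/...".
    return [p if p.endswith("/") else p + "/" for p in prefixes]


def is_denied(path: str, deny_prefixes: list[str]) -> bool:
    return any(path.startswith(p) for p in deny_prefixes)


deny = normalize_prefixes(["/secrets", "/config/"])

blocked = is_denied("/secrets/key.pem", deny)             # under a denied directory
sibling = is_denied("/secrets-public/readme.md", deny)    # similar name, different directory
bare = is_denied("/secrets", deny)                        # the directory name itself, no slash
```

Here `blocked` is true while `sibling` and `bare` are false. The last case means a file written at exactly `/secrets` is not covered by the deny list, so add an explicit check if that matters for your policy.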
275+
276+
## Protocol reference
277+
278+
Backends must implement the `BackendProtocol`.
279+
280+
Required methods:
281+
- `ls_info(path: str) -> list[FileInfo]`
282+
- Return entries with at least `path`. Include `is_dir`, `size`, `modified_at` when available. Sort by `path` for deterministic output.
283+
- `read(file_path: str, offset: int = 0, limit: int = 2000) -> str`
284+
- Return numbered content. On missing file, return `"Error: File '/x' not found"`.
285+
- `grep_raw(pattern: str, path: Optional[str] = None, glob: Optional[str] = None) -> list[GrepMatch] | str`
286+
- Return structured matches. For an invalid regex, return a string like `"Invalid regex pattern: ..."` (do not raise).
287+
- `glob_info(pattern: str, path: str = "/") -> list[FileInfo]`
288+
- Return matched files as `FileInfo` entries (empty list if none).
289+
- `write(file_path: str, content: str) -> WriteResult`
290+
- Create-only. On conflict, return `WriteResult(error=...)`. On success, set `path` and for state backends set `files_update={...}`; external backends should use `files_update=None`.
291+
- `edit(file_path: str, old_string: str, new_string: str, replace_all: bool = False) -> EditResult`
292+
- Enforce uniqueness of `old_string` unless `replace_all=True`. If not found, return error. Include `occurrences` on success.
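
The uniqueness rule for `edit` can be sketched as a pure function over file content. `apply_edit` is a hypothetical helper, and the error wording is an assumption; a real backend would wrap the outcome in an `EditResult`.

```python
from __future__ import annotations


def apply_edit(
    content: str, old_string: str, new_string: str, replace_all: bool = False
) -> tuple[str, int] | str:
    """Return (new_content, occurrences) on success, or an error string."""
    occurrences = content.count(old_string)
    if occurrences == 0:
        return f"Error: String not found in file: '{old_string}'"
    if occurrences > 1 and not replace_all:
        # Ambiguous edit: refuse unless the caller explicitly opts into replacing every match.
        return (
            f"Error: String '{old_string}' appears {occurrences} times. "
            "Pass replace_all=True or include more surrounding context."
        )
    return content.replace(old_string, new_string), occurrences


single = apply_edit("a b a", "b", "c")
ambiguous = apply_edit("a b a", "a", "z")
everywhere = apply_edit("a b a", "a", "z", replace_all=True)
```

Returning an error string for the ambiguous case, rather than raising, keeps the behavior consistent with the other methods in the protocol.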
293+
294+
Supporting types:
295+
- `WriteResult(error, path, files_update)`
296+
- `EditResult(error, path, files_update, occurrences)`
297+
- `FileInfo` with fields: `path` (required), optionally `is_dir`, `size`, `modified_at`.
298+
- `GrepMatch` with fields: `path`, `line`, `text`.
