@@ -0,0 +1,93 @@
# Error Codes

Error code definitions and categories, used for error handling and retry strategies.

## Usage Example

```python
import rock


def test_codes_values():
    """Test the basic status code values."""
    assert rock.codes.OK == 2000
    assert rock.codes.BAD_REQUEST == 4000
    assert rock.codes.INTERNAL_SERVER_ERROR == 5000
    assert rock.codes.COMMAND_ERROR == 6000
```

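## Code Categories

```python
OK = 2000, "OK"
"""
Success codes (2xxx)
"""

BAD_REQUEST = 4000, "Bad Request"
"""
Client error codes (4xxx):

These errors indicate a problem with the client request;
the SDK raises an exception.
"""

INTERNAL_SERVER_ERROR = 5000, "Internal Server Error"
"""
Server error codes (5xxx):

These errors indicate a problem on the server side;
the SDK raises an exception.
"""

COMMAND_ERROR = 6000, "Command Error"
"""
Command/execution error codes (6xxx):

These errors relate to command execution and are handled by the model;
the SDK does not raise an exception.
"""
```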

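## Retry Strategy Recommendations

- **When to retry**: retry only on `INTERNAL_SERVER_ERROR`.
- **How to handle the other cases**:
  - `BAD_REQUEST`: check the `arun` call logic for mistakes.
  - `COMMAND_ERROR`: stdout is written to `observation.output`, and stderr is written to `observation.failure_reason`.
  - `COMMAND_ERROR` note: when a bash command fails, stdout and stderr may both be non-empty, so it is recommended to pass both `output` and `failure_reason` from the observation to the model as prompt context.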
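
The recommendations above amount to a small decision table keyed on the code family. A minimal sketch of that table follows; `Handling` and `choose_handling` are hypothetical names introduced here, and grouping codes by their thousands digit is an assumption based on the 2xxx/4xxx/5xxx/6xxx categories, not an SDK function.

```python
from enum import Enum


class Handling(Enum):
    """Hypothetical labels for the handling options described above."""
    RETURN_RESULT = "return the observation to the caller"
    FIX_CALL = "do not retry; check the arun call logic"
    RETRY = "retry the call"
    FORWARD_TO_MODEL = "pass output and failure_reason to the model"


def choose_handling(code: int) -> Handling:
    """Map a numeric status code to the recommended handling.

    Assumes codes are grouped by their thousands digit (2xxx, 4xxx, 5xxx, 6xxx).
    """
    family = code // 1000
    if family == 2:
        return Handling.RETURN_RESULT
    if family == 4:
        return Handling.FIX_CALL
    if family == 5:
        return Handling.RETRY
    if family == 6:
        return Handling.FORWARD_TO_MODEL
    raise ValueError(f"unrecognized status code: {code}")
```

Only the 5xxx branch leads to a retry; every other branch either returns, surfaces a caller bug, or hands the result to the model.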

## Retry Example

```python
import asyncio
import logging

import rock
from rock.actions import Observation
# RockException must also be imported; its module path is not shown in this document.


async def run_with_retry(sandbox, retry_limit: int) -> Observation:
    """Run a background command with nohup, retrying only on server errors."""
    retry_times = 0
    while True:
        try:
            observation: Observation = await sandbox.arun(
                "python long_running_script.py",
                mode="nohup",
            )
            if observation.exit_code != 0:
                # Command failures are returned, not raised; log both streams.
                logging.warning(
                    f"Command failed with exit code {observation.exit_code}, "
                    f"output: {observation.output}, failure_reason: {observation.failure_reason}"
                )
            return observation
        except RockException as e:
            if not rock.codes.is_server_error(e.code):
                logging.error(
                    f"Non-retriable error occurred, code: {e.code}, "
                    f"message: {e.code.get_reason_phrase()}, exception: {e}."
                )
                raise
            retry_times += 1
            if retry_times >= retry_limit:
                # Count the failed attempt first so the final failure is raised.
                logging.error(f"All {retry_limit} attempts failed")
                raise
            logging.error(
                f"Server error occurred, code: {e.code}, message: {e.code.get_reason_phrase()}, "
                f"exception: {e}, will retry, times: {retry_times}."
            )
            await asyncio.sleep(2)
```
@@ -0,0 +1,69 @@
# Remote User

Remote user management, used to create and manage users inside the sandbox.

## Usage Example

```python
import asyncio
from rock.sdk.sandbox.config import SandboxConfig
from rock.sdk.sandbox.client import Sandbox

from rock.actions import Action, CreateBashSessionRequest, Observation


async def test_remote_user():
    config = SandboxConfig(
        image='hub.docker.alibaba-inc.com/chatos/python:3.11',
        xrl_authorization='xxx',
        cluster='nt-c'
    )
    sandbox = Sandbox(config)
    await sandbox.start()

    await sandbox.remote_user.create_remote_user('rock')
    assert await sandbox.remote_user.is_user_exist('rock')
    print('test remote user success')


async def test_create_session_with_remote_user():
    config = SandboxConfig(
        image='hub.docker.alibaba-inc.com/chatos/python:3.11',
        xrl_authorization='xxx',
        cluster='nt-c'
    )
    sandbox = Sandbox(config)
    await sandbox.start()

    await sandbox.remote_user.create_remote_user('rock')
    assert await sandbox.remote_user.is_user_exist('rock')

    # Open a bash session that runs as the newly created user.
    await sandbox.create_session(CreateBashSessionRequest(remote_user="rock", session="bash"))

    observation: Observation = await sandbox.run_in_session(
        action=Action(session="bash", command="whoami")
    )
    print(observation)
    assert observation.output.strip() == "rock"
    print('test create session with remote user success')


if __name__ == '__main__':
    asyncio.run(test_remote_user())
    asyncio.run(test_create_session_with_remote_user())
```

## API

### create_remote_user(username)

Creates a remote user.

```python
await sandbox.remote_user.create_remote_user('username')
```

### is_user_exist(username)

Checks whether a user exists.

```python
exists = await sandbox.remote_user.is_user_exist('username')
```
@@ -1,12 +1,12 @@
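# Handling Large Files and Long Command Output

## `arun`
In `nohup` mode, `arun()` provides two key parameters that let an agent or caller decouple "execution" from "inspection" as needed:

1. **`response_limited_bytes_in_nohup`** (int)
   Caps the number of characters returned (for example `64 * 1024`). It suits cases where part of the log must still be inspected immediately but bandwidth has to be controlled. The default, `None`, means no limit.

2. **`ignore_output`** (bool, default `False`)
   When set to `True`, `arun()` no longer reads the nohup output file. Instead, it returns a hint message as soon as the command finishes, containing the output file path, the **file size**, and how to view the file. The log is still written to `/tmp/tmp_<timestamp>.out` and can be read later on demand via `read_file`, the download API, or a custom command, which fully decouples "execution" from "inspection". The returned file size helps users decide whether to download the file directly or read it in chunks.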

```python
# ... lines collapsed in the diff view (hunk @@ -52,11 +52,62 @@) ...
print(resp_detached.output)
```

## `read_file_by_line_range`
Reads file content asynchronously by line range, with automatic chunked reading and session management, which makes it suitable for large files.

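### Key Features
- **Chunked reading of large files**: large files are automatically split into smaller chunks for reading
- **Automatic line counting**: when no end line is given, the total number of lines in the file is computed automatically
- **Built-in retries**: key operations are retried up to 3 times to improve reliability
- **Parameter validation**: input parameters are validated automatically
- **Session management**: a specific session can be supplied, or a temporary one is created automatically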

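### Parameters
| Parameter | Type | Default | Description |
|------|------|--------|------|
| `file_path` | str | - | Path of the file to read (absolute or relative path inside the sandbox) |
| `start_line` | int \| None | 1 | Starting line number (1-based) |
| `end_line` | int \| None | None | Ending line number (inclusive); defaults to the end of the file |
| `lines_per_request` | int | 1000 | Number of lines read per request; valid range 1-10000 |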

### Return Value
- `ReadFileResponse`: response object containing the file content
  - `content` (str): the file content that was read

### Exceptions
- `Exception`: raised when `start_line < 1`
- `Exception`: raised when `end_line < start_line`
- `Exception`: raised when `lines_per_request` is outside the range 1-10000
- `Exception`: raised when reading the file fails

### Usage Example

```python
# Read the entire file
response = await sandbox.read_file_by_line_range("/path/to/file.txt")

# Read a specific line range (lines 100 through 500)
response = await sandbox.read_file_by_line_range(
    "/path/to/file.txt",
    start_line=100,
    end_line=500
)

# Read from line 1990 to the end of the file
response = await sandbox.read_file_by_line_range(
    "/path/to/file.txt",
    start_line=1990
)

# Use a custom chunk size
response = await sandbox.read_file_by_line_range(
    "/path/to/file.txt",
    lines_per_request=5000
)
```

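### Notes
- Line numbers start at 1, not 0
- For large files, consider increasing `lines_per_request` to improve efficiency
- The file path must be a valid path inside the sandbox
- File reading relies on the `sed` command, so make sure the sandbox image provides it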
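
To make the chunking behavior concrete, here is a minimal local sketch of how a line range could be split into per-request chunks and turned into `sed` commands. It is an illustration under assumptions, not the SDK's implementation: `chunk_line_ranges` and `sed_print_command` are hypothetical helpers, the `sed -n 'A,Bp'` form is one plausible way to print a line range, and `ValueError` stands in for the generic `Exception` documented above.

```python
import shlex
from typing import Iterator, Tuple


def chunk_line_ranges(
    start_line: int, end_line: int, lines_per_request: int = 1000
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) line ranges covering start_line..end_line."""
    # Validation mirrors the documented constraints.
    if start_line < 1:
        raise ValueError("start_line must be >= 1")
    if end_line < start_line:
        raise ValueError("end_line must be >= start_line")
    if not 1 <= lines_per_request <= 10000:
        raise ValueError("lines_per_request must be within 1-10000")

    current = start_line
    while current <= end_line:
        last = min(current + lines_per_request - 1, end_line)
        yield current, last
        current = last + 1


def sed_print_command(file_path: str, first: int, last: int) -> str:
    """Build a sed command that prints only lines first..last of file_path."""
    return f"sed -n '{first},{last}p' {shlex.quote(file_path)}"


if __name__ == "__main__":
    for first, last in chunk_line_ranges(1, 2500, 1000):
        print(sed_print_command("/path/to/file.txt", first, last))
```

A larger `lines_per_request` yields fewer ranges and therefore fewer round trips, which is the trade-off behind the efficiency note above.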
@@ -0,0 +1,93 @@
# Error Codes

Error code definitions and categories for error handling and retry strategies.

## Usage Example

```python
import rock

def test_codes_values():
    """Test basic status code values"""
    assert rock.codes.OK == 2000
    assert rock.codes.BAD_REQUEST == 4000
    assert rock.codes.INTERNAL_SERVER_ERROR == 5000
    assert rock.codes.COMMAND_ERROR == 6000
```

## Codes Categories

```python
OK = 2000, "OK"
"""
Success codes (2xxx)
"""

BAD_REQUEST = 4000, "Bad Request"
"""
Client error codes (4xxx):

These errors indicate issues with the client request,
SDK will raise Exceptions for these errors.
"""

INTERNAL_SERVER_ERROR = 5000, "Internal Server Error"
"""
Server error codes (5xxx):

These errors indicate issues on the server side,
SDK will raise Exceptions for these errors.
"""

COMMAND_ERROR = 6000, "Command Error"
"""
Command/execution error codes (6xxx):

These errors are related to command execution and should be handled by the model,
SDK will NOT raise Exceptions for these errors.
"""
```
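
The `NAME = value, "phrase"` form above resembles the pattern Python's own `http.HTTPStatus` uses to attach a reason phrase to an integer code. The sketch below shows that pattern in isolation. The class name `Codes`, the `phrase` attribute, and the 5000-5999 range inside `is_server_error` are assumptions made for illustration; only the four values, their phrases, and the names `get_reason_phrase` and `is_server_error` appear in this document.

```python
from enum import IntEnum


class Codes(IntEnum):
    """Illustrative enum whose members compare equal to plain integers."""

    def __new__(cls, value: int, phrase: str):
        # Build the int member, then attach the human-readable phrase.
        obj = int.__new__(cls, value)
        obj._value_ = value
        obj.phrase = phrase
        return obj

    OK = 2000, "OK"
    BAD_REQUEST = 4000, "Bad Request"
    INTERNAL_SERVER_ERROR = 5000, "Internal Server Error"
    COMMAND_ERROR = 6000, "Command Error"

    def get_reason_phrase(self) -> str:
        return self.phrase


def is_server_error(code: int) -> bool:
    """Assumed range check: any code in 5000-5999 counts as a server error."""
    return 5000 <= code < 6000
```

Because the members are integers, assertions such as `Codes.OK == 2000` hold, which matches the style of the usage example earlier in this file.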

## Retry Strategy Recommendations

- **Retry trigger**: Only retry when `INTERNAL_SERVER_ERROR` occurs
- **Other error handling**:
  - `BAD_REQUEST`: Check whether the `arun` call logic has issues
  - `COMMAND_ERROR`: stdout goes to `observation.output`, stderr goes to `observation.failure_reason`
  - `COMMAND_ERROR` note: When bash execution fails, both stdout and stderr may be non-empty. It is recommended to prompt the model with both `output` and `failure_reason` from the observation.
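
As a sketch of the last recommendation, the helper below combines both streams into a single text block for the model. `ObservationStub` is a hypothetical stand-in that carries only the three fields mentioned in this document (`output`, `failure_reason`, `exit_code`), and the section labels in the prompt are an arbitrary choice, not an SDK format.

```python
from dataclasses import dataclass


@dataclass
class ObservationStub:
    """Hypothetical stand-in for the SDK's Observation; fields as named in this doc."""
    output: str = ""
    failure_reason: str = ""
    exit_code: int = 0


def build_model_prompt(observation) -> str:
    """Combine exit code, stdout, and stderr so the model sees all of them."""
    parts = [f"exit_code: {observation.exit_code}"]
    if observation.output:
        parts.append(f"stdout:\n{observation.output}")
    if observation.failure_reason:
        parts.append(f"stderr:\n{observation.failure_reason}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    failed = ObservationStub(output="step 1 done", failure_reason="step 2: file not found", exit_code=1)
    print(build_model_prompt(failed))
```

Empty streams are skipped so that a successful command does not produce blank sections in the prompt.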

## Retry Example

```python
import asyncio
import logging

import rock
from rock.actions import Observation
# RockException must also be imported; its module path is not shown in this document.


async def run_with_retry(sandbox, retry_limit: int) -> Observation:
    """Run a background command with nohup, retrying only on server errors."""
    retry_times = 0
    while True:
        try:
            observation: Observation = await sandbox.arun(
                "python long_running_script.py",
                mode="nohup",
            )
            if observation.exit_code != 0:
                # Command failures are returned, not raised; log both streams.
                logging.warning(
                    f"Command failed with exit code {observation.exit_code}, "
                    f"output: {observation.output}, failure_reason: {observation.failure_reason}"
                )
            return observation
        except RockException as e:
            if not rock.codes.is_server_error(e.code):
                logging.error(
                    f"Non-retriable error occurred, code: {e.code}, "
                    f"message: {e.code.get_reason_phrase()}, exception: {e}."
                )
                raise
            retry_times += 1
            if retry_times >= retry_limit:
                # Count the failed attempt first so the final failure is raised.
                logging.error(f"All {retry_limit} attempts failed")
                raise
            logging.error(
                f"Server error occurred, code: {e.code}, message: {e.code.get_reason_phrase()}, "
                f"exception: {e}, will retry, times: {retry_times}."
            )
            await asyncio.sleep(2)
```
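
The example above waits a fixed 2 seconds between attempts. A common variation is exponential backoff, where the delay doubles after each failure. The sketch below shows that variation as a generic helper that runs without the SDK. `ServerError` and `retry_async` are hypothetical names, and the doubling schedule is a swapped-in technique, not something this document prescribes.

```python
import asyncio


class ServerError(Exception):
    """Hypothetical stand-in for an SDK exception that carries a 5xxx code."""


async def retry_async(operation, *, retry_limit: int = 3, base_delay: float = 1.0,
                      retriable=(ServerError,)):
    """Await operation(), retrying retriable errors with exponential backoff.

    Makes at most retry_limit attempts; other exceptions propagate immediately.
    """
    for attempt in range(1, retry_limit + 1):
        try:
            return await operation()
        except retriable:
            if attempt == retry_limit:
                raise  # Out of attempts: surface the last error.
            # Delay grows as base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With the SDK, `operation` would wrap the `sandbox.arun(...)` call, and the `retriable` check would be replaced by the `rock.codes.is_server_error(e.code)` test shown in the example above.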