Datasystem readme #56
base: main
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,157 @@ | ||||||
| # openYuanrong datasystem Quick Start Guide | ||||||
|
|
||||||
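| ## Overview | ||||||
| openYuanrong datasystem is a distributed cache system. It uses the HBM/DRAM/SSD resources of a compute cluster to build a near-compute multi-level cache, improving data access performance for workloads such as model training and inference, big data, and microservices. | ||||||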
|
|
||||||
| ## Requirements | ||||||
| Operating system: openEuler 22.03 or later | ||||||
| CANN: 8.2.rc1 or later | ||||||
| Python: 3.9–3.11 | ||||||
| etcd: 3.5.12 or later | ||||||
|
|
||||||
| ## Deploying etcd | ||||||
| ### Installation | ||||||
| 1. Download the binary package (see [etcd GitHub Releases](https://github.com/etcd-io/etcd/releases)): | ||||||
| ```bash | ||||||
| ETCD_VERSION="v3.5.12" | ||||||
| wget https://github.com/etcd-io/etcd/releases/download/${ETCD_VERSION}/etcd-${ETCD_VERSION}-linux-amd64.tar.gz | ||||||
|
Collaborator
Use a variable for the CPU architecture as well, and cover both x86 and ARM.
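One way to act on this comment is sketched below. It assumes the host reports its architecture through `uname -m` and relies on the `linux-amd64` / `linux-arm64` suffixes used in etcd release file names. The helper names `etcd_arch` and `etcd_url` are hypothetical and not part of the README under review.

```bash
# Map a machine name (as printed by `uname -m`) to the architecture
# suffix used in etcd release file names.
etcd_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the download URL for a given version tag and machine name.
# $1 = version tag (e.g. v3.5.12), $2 = machine name from `uname -m`
etcd_url() {
  _arch="$(etcd_arch "$2")" || return 1
  echo "https://github.com/etcd-io/etcd/releases/download/${1}/etcd-${1}-linux-${_arch}.tar.gz"
}
```

With these helpers, the download step could become `wget "$(etcd_url "${ETCD_VERSION}" "$(uname -m)")"`, and the later `tar` and `cd` steps could reuse the same architecture value instead of a hard-coded `amd64`.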
||||||
| ``` | ||||||
| 2. Extract and install: | ||||||
| ```bash | ||||||
| tar -xvf etcd-${ETCD_VERSION}-linux-amd64.tar.gz | ||||||
| cd etcd-${ETCD_VERSION}-linux-amd64 | ||||||
| sudo cp etcd etcdctl /usr/local/bin/ | ||||||
| ``` | ||||||
| 3. Verify the installation: | ||||||
| ```bash | ||||||
| etcd --version | ||||||
| etcdctl version | ||||||
| ``` | ||||||
| If both commands print a version number, the installation succeeded. | ||||||
|
|
||||||
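| ### Starting the cluster | ||||||
| > Note: the following is a minimal single-node deployment example. For production environments, see the [official clustering guide](https://etcd.io/docs/current/op-guide/clustering/). | ||||||
| 1. Start a single-node etcd cluster on any free ports (for example, 2379 and 2380 ): | ||||||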
|
||||||
| 1. Start a single-node etcd cluster on any free ports (for example, 2379 and 2380 ): | |
| 1. Start a single-node etcd cluster on any free ports (for example, 2379 and 2380): |
Copilot
AI
Dec 11, 2025
The example etcd startup command binds client and peer URLs to http://0.0.0.0 without TLS or authentication, which exposes the key-value store to any host that can reach this machine and can lead to unauthorized reads/writes of cluster metadata and potentially sensitive data. An attacker on the same network could directly interact with etcd on ports 2379/2380 using etcdctl or raw HTTP. For safer defaults, restrict --listen-client-urls/--listen-peer-urls to 127.0.0.1 or a secured interface and document enabling TLS and authentication for non-local or production use.
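A loopback-only variant of the startup command might look like the sketch below. The README's actual command is not shown in this thread, so the node name `etcd-single` and the helper `etcd_local_cmd` are assumptions made for illustration; the flags themselves are standard etcd options. The helper prints the command rather than running it, so it can be reviewed first.

```bash
# Hypothetical values for illustration; adjust to the real deployment.
ETCD_NAME="etcd-single"
CLIENT_URL="http://127.0.0.1:2379"   # loopback only, instead of 0.0.0.0
PEER_URL="http://127.0.0.1:2380"     # loopback only, instead of 0.0.0.0

# Print the etcd command line without executing it.
etcd_local_cmd() {
  echo "etcd" \
    "--name ${ETCD_NAME}" \
    "--listen-client-urls ${CLIENT_URL}" \
    "--advertise-client-urls ${CLIENT_URL}" \
    "--listen-peer-urls ${PEER_URL}" \
    "--initial-advertise-peer-urls ${PEER_URL}" \
    "--initial-cluster ${ETCD_NAME}=${PEER_URL}"
}

etcd_local_cmd
```

Binding to loopback makes etcd unreachable from other hosts, so this only suits a single-machine trial. A multi-node deployment would more likely bind to a private interface and enable TLS (`--cert-file`, `--key-file`, `https://` URLs) and authentication (`etcdctl auth enable`).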
Copilot
AI
Dec 11, 2025
Add space between "节点" and "peerURL" for better readability. Should be "节点名=节点 peerURL" instead of "节点名=节点peerURL".
| - `--initial-cluster`: initial node list, format: 节点名=节点peerURL. | |
| - `--initial-cluster`: initial node list, format: 节点名=节点 peerURL. |
The etcd package above is for x86, but the datasystem package here is for ARM, so the install cannot work as written. Use a variable for the CPU architecture here too, as with etcd.
Copilot
AI
Dec 11, 2025
The placeholder "${WORKER_IP_N}" with "N" suffix suggests multiple workers, but the instruction says "在每个节点启动一个监听端口号为 31501 的服务端进程" (start one server process on each node). It's unclear if multiple worker processes should run on the same node with different IPs or if each physical node runs one worker. Consider clarifying whether N represents different physical nodes or multiple workers per node.
Add a paragraph explaining why Mooncake alone is sufficient for EC and how it works. Intuitively, EC should use the yuanrong connector, the same as KVC.
Copilot
AI
Dec 11, 2025
The Prefill-Decoder node configuration is identical to the Encoder node configuration, both using "ec_producer" role. In a 1E1PD architecture, the Prefill-Decoder should typically be a consumer of the Encoder's output. The ec_role should likely be "ec_consumer" instead of "ec_producer".
| "ec_role": "ec_producer" | |
| "ec_role": "ec_consumer" |
Do other operating systems not work? Does it have to be openEuler?