
Sync MMKG pipeline docs with actual operator behavior #32

Open
W-RMSL wants to merge 1 commit into main from docs/mmkg-pipeline-update

Conversation


@W-RMSL W-RMSL commented May 11, 2026

No description provided.

Copilot AI review requested due to automatic review settings May 11, 2026 06:44

Copilot AI left a comment


Pull request overview

This PR updates the Multimodal KG pipeline guide (ZH/EN) to better reflect the current MMKG pipeline’s expected inputs/outputs and example data, including sample image paths and updated visual-triple examples.

Changes:

  • Add sample image references and update img_dict guidance for local image paths.
  • Update the documented visual triple (vis_triple) format and example IO payloads.
  • Document vis_url alongside the input schema and update example JSON accordingly.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.

File | Description
docs/zh/notes/kg_guide/kg_pipelines_by_types/multimodal_kg_pipeline.md | Updates the ZH multimodal pipeline guide’s sample assets, input schema, and visual triple examples.
docs/en/notes/kg_guide/kg_pipelines_by_types/multimodal_kg_pipeline.md | Updates the EN multimodal pipeline guide’s sample assets, input schema, and visual triple examples.


Comment on lines 74 to 79
This pipeline requires at least the following fields:

- **raw_chunk**: source text for entity and textual triple extraction
- **img_dict**: an image dictionary where keys are image IDs and values are local image paths
- **img_dict**: an image dictionary where keys are image IDs (free-form strings that appear in `vis_triple`) and values are local image paths
- **vis_url**: a list of image paths the VLM opens during step 5 QA generation. Its order must match the order in which image IDs first appear in `img_dict`.
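The documented input schema can be sketched as a minimal payload. The field names come from the doc; the chunk text is illustrative, and the image IDs and paths reuse the sample assets mentioned in this PR:

```python
# Minimal input record matching the documented schema.
# raw_chunk text is a placeholder; image IDs/paths follow the PR's examples.
record = {
    "raw_chunk": "Elon Musk unveiled the Cybertruck on stage.",
    "img_dict": {
        "img_cybertruck": "example_data/MultimodalKGPipeline/images/cyber.jpg",
        "img_musk_stage": "example_data/MultimodalKGPipeline/images/musk.jpg",
    },
    "vis_url": [
        "example_data/MultimodalKGPipeline/images/cyber.jpg",
        "example_data/MultimodalKGPipeline/images/musk.jpg",
    ],
}

# The doc requires vis_url to follow the order in which image IDs first
# appear in img_dict; since Python dicts preserve insertion order, that
# constraint can be checked directly:
assert record["vis_url"] == list(record["img_dict"].values())
```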


- combine images and candidate entities to extract visual facts
- typically output triples in the form `"<subj> entity <rel> depicted_in <obj> image_id"`
- output triples in the form `"<subj> entity <obj> image_id <rel> depicted_in "`
Comment on lines +174 to +176
"<subj> Cybertruck <obj> img_cybertruck <rel> depicted_in ",
"<subj> Elon Musk <obj> img_musk_stage <rel> depicted_in ",
"<subj> Cybertruck <obj> img_musk_stage <rel> depicted_in "
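A triple in the documented `"<subj> entity <obj> image_id <rel> depicted_in "` format can be split back into its components with a small helper. This parser is a sketch based on the format shown above, not part of the pipeline's own code:

```python
import re

# Matches the documented visual-triple string layout, tolerating the
# trailing space after "depicted_in" seen in the examples.
VIS_TRIPLE_RE = re.compile(r"<subj>\s*(.*?)\s*<obj>\s*(.*?)\s*<rel>\s*(\S+)")

def parse_vis_triple(triple: str) -> tuple[str, str, str]:
    """Return (subject, image_id, relation) from a vis_triple string."""
    m = VIS_TRIPLE_RE.match(triple)
    if m is None:
        raise ValueError(f"not a visual triple: {triple!r}")
    return m.groups()

print(parse_vis_triple("<subj> Cybertruck <obj> img_cybertruck <rel> depicted_in "))
# → ('Cybertruck', 'img_cybertruck', 'depicted_in')
```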
- Sample images: `example_data/MultimodalKGPipeline/images/cyber.jpg`, `example_data/MultimodalKGPipeline/images/musk.jpg`

For real image-text workloads, values in `img_dict` must be valid local image paths. The default JSON file is provided as a runnable input structure.
Values in `img_dict` are the actual paths the VLM serving layer opens. Only **local paths** are supported today: the serving layer calls `open(path, "rb")` and base64-encodes the bytes into the request, and it does not fetch remote URLs. The default data ships with runnable example images, and the paths are written relative to the `api_pipelines/` directory (the CWD when you run `python multimodal_kg_pipeline.py`).
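The local-path behavior described above can be sketched as follows. This is a minimal reimplementation of what such a serving layer typically does, not the project's actual code:

```python
import base64
import mimetypes

def encode_local_image(path: str) -> str:
    """Inline a local image file as a base64 data URL.

    Mirrors the behavior described in the docs: the serving layer opens
    the path with open(path, "rb") and does NOT fetch remote URLs, so
    http(s):// values in img_dict would fail here.
    """
    if path.startswith(("http://", "https://")):
        raise ValueError("remote URLs are not supported; use a local path")
    mime = mimetypes.guess_type(path)[0] or "image/jpeg"
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{data}"
```

Because the file is opened relative to the current working directory, the relative paths in the default data only resolve when the script is launched from `api_pipelines/`.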
Comment on lines 74 to 79
This pipeline requires at least the following fields:

- **raw_chunk**: the source text, used for entity and textual triple extraction.
- **img_dict**: an image dictionary where keys are image IDs and values are local image paths.
- **img_dict**: an image dictionary where keys are image IDs (free-form strings that appear in `vis_triple`) and values are local image paths.
- **vis_url**: a list of image paths opened and passed to the VLM during step 5 QA generation. The element order must match the order in which image IDs first appear in `img_dict`.


- combine images with candidate entities to extract visual facts
- triples are typically output in the form `"<subj> entity <rel> depicted_in <obj> image_id"`
- triples are output in the form `"<subj> entity <obj> image_id <rel> depicted_in "`
Comment on lines +174 to +176
"<subj> Cybertruck <obj> img_cybertruck <rel> depicted_in ",
"<subj> Elon Musk <obj> img_musk_stage <rel> depicted_in ",
"<subj> Cybertruck <obj> img_musk_stage <rel> depicted_in "
- Sample images: `example_data/MultimodalKGPipeline/images/cyber.jpg`, `example_data/MultimodalKGPipeline/images/musk.jpg`

For real image-text workloads, the values in `img_dict` must be locally accessible image paths; the default data is in JSON format and can be used directly as a structural example.
The values in `img_dict` are the image paths the VLM actually opens. Only **local paths** are supported today: the serving layer reads the bytes with `open(path, "rb")` and base64-encodes them inline into the request; it does not download remote URLs. The default data ships with runnable example images, and the paths are given relative to the `api_pipelines/` directory (the CWD when you run `python multimodal_kg_pipeline.py`).